Preface


General relativity is one of the cornerstones of classical physics, providing a 
synthesis of special relativity and gravitation, and is central to our understanding 
of many areas of astrophysics and cosmology. This book is intended to give an 
introduction to this important subject, suitable for a one-term course for advanced 
undergraduate or beginning graduate students in physics or in related disciplines 
such as astrophysics and applied mathematics. Some of the later chapters should 
also provide a useful reference for professionals in the fields of astrophysics and 
cosmology. 

It is assumed that the reader has already been exposed to special relativity and 
Newtonian gravitation at a level typical of early-stage university physics courses. 
Nevertheless, a summary of special relativity from first principles is given in 
Chapter 1, and a brief discussion of Newtonian gravity is presented in Chapter 7. 
No previous experience of 4-vector methods is assumed. Some background in 
electromagnetism will prove useful, as will some experience of standard vector 
calculus methods in three-dimensional Euclidean space. The overall level of
mathematical expertise assumed is that of a typical university mathematical methods
course. 

The book begins with a review of the basic concepts underlying special relativity
in Chapter 1. The subject is introduced in a way that encourages from the
outset a geometrical and transparently four-dimensional viewpoint, which lays the
conceptual foundations for discussion of the more complicated spacetime geometries
encountered later in general relativity. In Chapters 2-4 we then present a
mini-course in basic differential geometry, beginning with the introduction of 
manifolds, coordinates and non-Euclidean geometry in Chapter 2. The topic of 
vector calculus on manifolds is developed in Chapter 3, and these ideas are 
extended to general tensors in Chapter 4. These necessary mathematical preliminaries
are presented in such a way as to make them accessible to physics students
with a background in standard vector calculus. A reasonable level of mathematical 
rigour has been maintained throughout, albeit accompanied by the occasional 
appeal to geometric intuition. The mathematical tools thus developed are then
illustrated in Chapter 5 by re-examining the familiar topic of special relativity in a
more formal manner, through the use of tensor calculus in Minkowski spacetime. 
These methods are further illustrated in Chapter 6, in which electromagnetism is 
described as a field theory in Minkowski spacetime, serving in some respects as a 
‘prototype’ for the later discussion of gravitation. In Chapter 7, the incompatibility 
of special relativity and Newtonian gravitation is presented and the equivalence 
principle is introduced. This leads naturally to a discussion of spacetime curvature 
and the associated mathematics. The field equations of general relativity are then 
derived in Chapter 8, and a discussion of their general properties is presented. 

The physical consequences of general relativity in a wide variety of astrophysical
and cosmological applications are discussed in Chapters 9-18. In particular,
the Schwarzschild geometry is derived in Chapter 9 and used to discuss the physics 
outside a massive spherical body. Classic experimental tests of general relativity 
based on the exterior Schwarzschild geometry are presented in Chapter 10. The 
interior Schwarzschild geometry and non-rotating black holes are discussed in 
Chapter 11, together with a brief mention of Kruskal coordinates and wormholes. 
In Chapter 12 we introduce two non-vacuum spherically symmetric geometries 
with a discussion of relativistic stars and charged black holes. Rotating objects are 
discussed in Chapter 13, including an extensive discussion of the Kerr solution. In
Chapters 14-16 we describe the application of general relativity to cosmology and 
present a discussion of the Friedmann-Robertson-Walker geometry, cosmological
models and the theory of inflation, including the generation of perturbations
in the early universe. In Chapter 17 we describe linearised gravitation and weak
gravitational fields, in particular drawing analogies with the theory of electromagnetism.
The equations of linearised gravitation are then applied to the generation,
propagation and detection of weak gravitational waves in Chapter 18. The book 
concludes in Chapter 19 with a brief discussion of classical field theory and the 
derivation of the field equations of electromagnetism and general relativity from 
variational principles. 

Each chapter concludes with a number of exercises that are intended to illuminate
and extend the discussion in the main text. It is strongly recommended that
the reader attempt as many of these exercises as time permits, as they should give 
ample opportunity to test his or her understanding. Occasionally chapters have 
appendices containing material that is not central to the development presented in 
the main text, but may nevertheless be of interest to the reader. Some appendices 
provide historical context, some discuss current astronomical observations and 
some give detailed mathematical derivations that might otherwise interrupt the 
flow of the main text. 



With regard to the presentation of the mathematics, it has to be accepted 
that equations containing partial and covariant derivatives could be written more 
compactly by using the comma and semi-colon notation, e.g. $v^a{}_{,b}$ for the partial
derivative of a vector and $v^a{}_{;b}$ for its covariant derivative. This would certainly
save typographical space, but many students find the labour of mentally unpacking 
such equations is sufficiently great that it is not possible to think of an equation’s 
physical interpretation at the same time. Consequently, we have decided to write 
out such expressions in their more obvious but longer form, using $\partial_b v^a$ for partial
derivatives and $\nabla_b v^a$ for covariant derivatives.

It is worth mentioning that this book is based, in large part, on lecture notes
prepared separately by MPH and GPE for two different relativity courses in the
Natural Science Tripos at the University of Cambridge. These courses were first
presented in this form in the academic year 1999-2000 and are still ongoing. The
course presented by MPH consisted of 16 lectures to fourth-year undergraduates
in Part III Physics and Theoretical Physics and covered most of the material
in Chapters 1-11 and 13-14, albeit somewhat rapidly on occasion. The course
given by GPE consisted of 24 lectures to third-year undergraduates in Part II
Astrophysics and covered parts of Chapters 1, 5-11, 14 and 18, with an emphasis 
on the less mathematical material. The process of combining the two sets of 
lecture notes into a homogeneous treatment of relativistic gravitation was aided 
somewhat by the fortuitous choice of a consistent sign convention in the two 
courses, and numerous sections have been rewritten in the hope that the reader 
will not encounter any jarring changes in presentational style. For many of the 
topics covered in the two courses mentioned above, the opportunity has been 
taken to include in this book a considerable amount of additional material beyond 
that presented in the lectures, especially in the discussion of black holes. Some 
of this material draws on lecture notes written by ANL for other courses in Part
II and Part III Physics and Theoretical Physics. Some topics that were entirely
absent from any of the above lecture courses have also been included in the book, 
such as relativistic stars, cosmology, inflation, linearised gravity and variational 
principles. While every care has been taken to describe these topics in a clear and 
illuminating fashion, the reader should bear in mind that these chapters have not
been ‘road-tested’ to the same extent as the rest of the book. 

It is with pleasure that we record here our gratitude to those authors from 
whose books we ourselves learnt general relativity and who have certainly 
influenced our own presentation of the subject. In particular, we acknowledge 
(in their current latest editions) S. Weinberg, Gravitation and Cosmology, 
Wiley, 1972; R. M. Wald, General Relativity, University of Chicago Press, 
1984; B. Schutz, A First Course in General Relativity, Cambridge University
Press, 1985; W. Rindler, Relativity: Special, General and Cosmological,
Oxford University Press, 2001; and J. Foster & J. D. Nightingale, A Short Course
in General Relativity, Springer-Verlag, 1995. 

During the writing of this book we have received much help and encouragement
from many of our colleagues at the University of Cambridge, especially
members of the Cavendish Astrophysics Group and the Institute of Astronomy. 
In particular, we thank Chris Doran, Anthony Challinor, Steve Gull and Paul 
Alexander for numerous useful discussions on all aspects of relativity theory, and 
Dave Green for a great deal of advice concerning typesetting in LaTeX. We are
also especially grateful to Richard Sword for creating many of the diagrams and 
figures used in the book and to Michael Bridges for producing the plots of recent 
measurements of the cosmic microwave background and matter power spectra. 
We also extend our thanks to the Cavendish and Institute of Astronomy teaching
staff, whose examination questions have provided the basis for some of the
exercises included. Finally, we thank several years of undergraduate students for 
their careful reading of sections of the manuscript, for pointing out misprints and 
for numerous useful comments. Of course, any errors and ambiguities remaining 
are entirely the responsibility of the authors, and we would be most grateful to
have them brought to our attention. At Cambridge University Press, we are very
grateful to our editor Vince Higgs for his help and patience and to our copy-editor 
Susan Parkinson for many useful suggestions that have undoubtedly improved the 
style of the book. 

Finally, on a personal note, MPH thanks his wife, Becky, for patiently enduring 
many evenings and weekends spent listening to the sound of fingers tapping on 
a keyboard, and for her unending encouragement. He also thanks his mother, 
Pat, for her tireless support at every turn. MPH dedicates his contribution to this 
book to the memory of his father, Ron, and to his daughter, Tabitha, whose early 
arrival succeeded in delaying completion of the book by at least three months, but 
equally made him realise how little that mattered. GPE thanks his wife, Yvonne, 
for her support. ANL thanks all the students who have sat through his various 
lectures on gravitation and cosmology and provided useful feedback. He would 
also like to thank his family, and particularly his parents, for the encouragement 
and support they have offered at all times. 



1 

The spacetime of special relativity 


We begin our discussion of the relativistic theory of gravity by reviewing some 
basic notions underlying the Newtonian and special-relativistic viewpoints of 
space and time. In order to specify an event uniquely, we must assign it three 
spatial coordinates and one time coordinate, defined with respect to some frame 
of reference. For the moment, let us define such a system S by using a set of three 
mutually orthogonal Cartesian axes, which gives us spatial coordinates x, y and 
z, and an associated system of synchronised clocks at rest in the system, which 
gives us a time coordinate t. The four coordinates (t, x, y, z) thus label events in
space and time. 


1.1 Inertial frames and the principle of relativity 

Clearly, one is free to label events not only with respect to a frame S but also 
with respect to any other frame S', which may be oriented and/or moving with 
respect to S in an arbitrary manner. Nevertheless, there exists a class of preferred 
reference systems called inertial frames, defined as those in which Newton’s first 
law holds, so that a free particle is at rest or moves with constant velocity, i.e. in 
a straight line with fixed speed. In Cartesian coordinates this means that 


$$\frac{d^2x}{dt^2} = \frac{d^2y}{dt^2} = \frac{d^2z}{dt^2} = 0.$$


It follows that, in the absence of gravity, if S and S' are two inertial frames then
S' can differ from S only by (i) a translation, and/or (ii) a rotation and/or (iii) a
motion of one frame with respect to the other at a constant velocity (for otherwise 
Newton’s first law would no longer be true). The concept of inertial frames is 
fundamental to the principle of relativity, which states that the laws of physics 
take the same form in every inertial frame. No exception has ever been found to 
this general principle, and it applies equally well in both Newtonian theory and 
special relativity. 

The Newtonian and special-relativistic descriptions differ in how the coordinates
of an event P in two inertial frames are related. Let us consider two
Cartesian inertial frames S and S' in standard configuration, where S' is moving
along the x-axis of S at a constant speed v and the axes of S and S' coincide at
t = t' = 0 (see Figure 1.1). It is clear that the (primed) coordinates of an event
P with respect to S' are related to the (unprimed) coordinates in S via a linear
transformation¹ of the form


$$\begin{aligned}
t' &= At + Bx, \\
x' &= Dt + Ex, \\
y' &= y, \\
z' &= z.
\end{aligned}$$

Moreover, since we require that x' = 0 corresponds to x = vt and that x = 0
corresponds to x' = -vt', we find immediately that D = -Ev and D = -Av, so
that A = E. Thus we must have

$$\begin{aligned}
t' &= At + Bx, \\
x' &= A(x - vt), \\
y' &= y, \\
z' &= z.
\end{aligned} \tag{1.1}$$



Figure 1.1 Two inertial frames S and S' in standard configuration (the origins
of S and S' coincide at t = t' = 0).


¹ We will prove this in Chapter 5.





1.2 Newtonian geometry of space and time 
Newtonian theory rests on the assumption that there exists an absolute time, which
is the same for every observer, so that t' = t. Under this assumption A = 1 and 
B = 0, and we obtain the Galilean transformation relating the coordinates of an 
event P in the two Cartesian inertial frames S and S': 


$$\begin{aligned}
t' &= t, \\
x' &= x - vt, \\
y' &= y, \\
z' &= z.
\end{aligned} \tag{1.2}$$


By symmetry, the expressions for the unprimed coordinates in terms of the primed 
ones have the same form but with v replaced by —v. 

The first equation in (1.2) is clearly valid for any two inertial frames S and 
S' and shows that the time coordinate of an event P is the same in all inertial 
frames. The second equation leads to the ‘common sense’ notion of the addition 
of velocities. If a particle is moving in the x-direction at a speed $u_x$ in S then its
speed in S' is given by

$$u'_x = \frac{dx'}{dt'} = \frac{dx'}{dt} = \frac{dx}{dt} - v = u_x - v.$$

Differentiating again shows that the acceleration of a particle is the same in both 
S and S', i.e. $du'_x/dt' = du_x/dt$.

If we consider two events A and B that have coordinates $(t_A, x_A, y_A, z_A)$
and $(t_B, x_B, y_B, z_B)$ respectively, it is straightforward to show that both the time
difference $\Delta t = t_B - t_A$ and the quantity


$$\Delta r^2 = \Delta x^2 + \Delta y^2 + \Delta z^2$$


are separately invariant under any Galilean transformation. This leads us to 
consider space and time as separate entities. Moreover, the invariance of $\Delta r^2$
suggests that it is a geometric property of space itself. Of course, we recognise
$\Delta r^2$ as the square of the distance between the events in a three-dimensional
Euclidean space. This defines the geometry of space and time in the Newtonian 
picture. 


1.3 The spacetime geometry of special relativity 

In special relativity, Einstein abandoned the postulate of an absolute time and 
replaced it by the postulate that the speed of light c is the same in all inertial 
frames.² By applying this new postulate, together with the principle of relativity,
we may obtain the Lorentz transformations connecting the coordinates of an event 
P in two different Cartesian inertial frames S and S'. 

Let us again consider S and S' to be in standard configuration (see Figure 1.1), 
and consider a photon emitted from the (coincident) origins of S and S' at
t = t' = 0 and travelling in an arbitrary direction. Subsequently the space and time
coordinates of the photon in each frame must satisfy


$$c^2t^2 - x^2 - y^2 - z^2 = c^2t'^2 - x'^2 - y'^2 - z'^2 = 0.$$


Substituting the relations (1.1) into this expression and solving for the constants
A and B, we obtain

$$\begin{aligned}
ct' &= \gamma(ct - \beta x), \\
x' &= \gamma(x - \beta ct), \\
y' &= y, \\
z' &= z,
\end{aligned} \tag{1.3}$$

where $\beta = v/c$ and $\gamma = (1 - \beta^2)^{-1/2}$. This Lorentz transformation, also known
as a boost in the x-direction, reduces to the Galilean transformation (1.2) when
$\beta \ll 1$. Once again, symmetry demands that the unprimed coordinates are given
in terms of the primed coordinates by an analogous transformation in which v is 
replaced by —v. 
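
As a numerical cross-check of the boost just described, the following Python sketch implements ct' = γ(ct - βx), x' = γ(x - βct), y' = y, z' = z and tests the light-cone condition for a photon emitted from the common origin. The function names, the sample photon and the choice of units with c = 1 are illustrative assumptions, not part of the text.

```python
import math

def lorentz_boost(t, x, y, z, v, c=1.0):
    """Return (t', x', y', z') in S' for an event (t, x, y, z) in S.

    Assumes S and S' are in standard configuration, with S' moving at
    speed v (|v| < c) along the x-axis of S.
    """
    beta = v / c
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ct_prime = gamma * (c * t - beta * x)
    x_prime = gamma * (x - beta * c * t)
    return ct_prime / c, x_prime, y, z

def galilean(t, x, y, z, v):
    """Galilean transformation for the same configuration."""
    return t, x - v * t, y, z

def light_cone(t, x, y, z, c=1.0):
    """c^2 t^2 - x^2 - y^2 - z^2, which vanishes for a photon from the origin."""
    return (c * t)**2 - x**2 - y**2 - z**2

# Photon emitted from the origin at t = 0 along the direction (0.6, 0.8, 0),
# observed at t = 2 in S (units with c = 1).
photon_S = (2.0, 1.2, 1.6, 0.0)
photon_Sprime = lorentz_boost(*photon_S, v=0.5)
print(light_cone(*photon_S), light_cone(*photon_Sprime))  # both ~0 up to rounding

# For beta << 1 the boost approaches the Galilean transformation.
slow = lorentz_boost(*photon_S, v=1e-6)
print(slow, galilean(*photon_S, v=1e-6))
```

Both printed light-cone values vanish to rounding error, and the slow boost agrees with the Galilean result to first order in β.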

From the equations (1.3), we see that the time and space coordinates are in
general mixed by a Lorentz transformation (note, in particular, the symmetry
between ct and x). Moreover, as we shall see shortly, if we consider two events
A and B with coordinates $(t_A, x_A, y_A, z_A)$ and $(t_B, x_B, y_B, z_B)$ in S, it is
straightforward to show that the interval (squared)

$$\Delta s^2 = c^2\Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2 \tag{1.4}$$

is invariant under any Lorentz transformation. As advocated by Minkowski, these 
observations lead us to consider space and time as united in a four-dimensional 
continuum called spacetime , whose geometry is characterised by (1.4). We note 
that the spacetime of special relativity is non-Euclidean, because of the minus 
signs in (1.4), and is often called the pseudo-Euclidean or Minkowski geometry. 
Nevertheless, for any fixed value of t the spatial part of the geometry remains
Euclidean. 



² The reasoning behind Einstein’s proposal is discussed in Appendix 1A.





We have arrived at the familiar viewpoint (to a physicist!) where the physical
world is modelled as a four-dimensional spacetime continuum that possesses 
the Minkowski geometry characterised by (1.4). Indeed, many ideas in special 
relativity are most simply explained by adopting a four-dimensional point of view. 

1.4 Lorentz transformations as four-dimensional ‘rotations’ 

Adopting a particular (Cartesian) inertial frame S corresponds to labelling events in 
the Minkowski spacetime with a given set of coordinates (t, x, y, z). If we choose
instead to describe the world with respect to a different Cartesian inertial frame
S' then this corresponds simply to relabelling events in the Minkowski spacetime
with a new set of coordinates (t', x', y', z'); the primed and unprimed coordinates
are related by the appropriate Lorentz transformation. Thus, describing physics 
in terms of different inertial frames is equivalent to performing a coordinate 
transformation on the Minkowski spacetime. 

Consider, for example, the case where S' is related to S via a spatial rotation 
through an angle $\theta$ about the x-axis. In this case, we have

$$\begin{aligned}
ct' &= ct, \\
x' &= x, \\
y' &= y\cos\theta - z\sin\theta, \\
z' &= y\sin\theta + z\cos\theta.
\end{aligned}$$

Clearly the inverse transform is obtained on replacing $\theta$ by $-\theta$.

The close similarity between the ‘boost’ (1.3) and an ordinary spatial rotation 
can be highlighted by introducing the rapidity parameter 

$$\psi = \tanh^{-1}\beta.$$

As $\beta$ varies from zero to unity, $\psi$ ranges from 0 to $\infty$. We also note that $\gamma = \cosh\psi$
and $\gamma\beta = \sinh\psi$. If two inertial frames S and S' are in standard configuration, we
therefore have

$$\begin{aligned}
ct' &= ct\cosh\psi - x\sinh\psi, \\
x' &= -ct\sinh\psi + x\cosh\psi, \\
y' &= y, \\
z' &= z.
\end{aligned} \tag{1.5}$$

This has essentially the same form as a spatial rotation, but with hyperbolic
functions replacing trigonometric ones. Once again the inverse transformation is
obtained on replacing $\psi$ by $-\psi$.
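
The hyperbolic form of the boost can be compared numerically with the γ, β form. The sketch below (function names and sample values are assumptions chosen for illustration, with c = 1) also checks a standard property that is not derived in the text above: applying two boosts along x in succession is equivalent to a single boost whose rapidity is the sum of the two, just as successive rotations about one axis add their angles.

```python
import math

def boost_gamma(ct, x, beta):
    """Boost written with gamma and beta: ct' = g(ct - b x), x' = g(x - b ct)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)

def boost_rapidity(ct, x, psi):
    """The same boost written as a hyperbolic 'rotation' with rapidity psi."""
    return (ct * math.cosh(psi) - x * math.sinh(psi),
            -ct * math.sinh(psi) + x * math.cosh(psi))

beta1, beta2 = 0.5, 0.5
psi1, psi2 = math.atanh(beta1), math.atanh(beta2)
event = (3.0, 1.0)  # (ct, x) of an arbitrary sample event

# The two forms of a single boost agree.
print(boost_gamma(*event, beta1), boost_rapidity(*event, psi1))

# Two successive boosts versus one boost with the summed rapidity.
two_steps = boost_rapidity(*boost_rapidity(*event, psi1), psi2)
one_step = boost_rapidity(*event, psi1 + psi2)
print(two_steps, one_step)

# Speed corresponding to the summed rapidity; it stays below 1 (i.e. below c).
beta_total = math.tanh(psi1 + psi2)
print(beta_total)
```

The summed rapidity corresponds to β = (β₁ + β₂)/(1 + β₁β₂), so two boosts at half the speed of light combine to 0.8c rather than c.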
Figure 1.2 Two inertial frames S and S' in general configuration. The broken
line shows the trajectory of the origin of S'.

In general, S' is moving with a constant velocity $\vec{v}$ with respect to S in an
arbitrary direction³ and the axes of S' are rotated with respect to those of S.
Moreover, at t = t' = 0 the origins of S and S' need not be coincident and may
be separated by a vector displacement $\vec{a}$, as measured in S (see Figure 1.2).⁴
The corresponding transformation connecting the two inertial frames is most 
easily found by decomposing the transformation into a displacement, followed 
by a spatial rotation, followed by a boost, followed by a further spatial rotation. 
Physically, the displacement makes the origins of S and S' coincident at t = t' = 0, 
and the first rotation lines up the x-axis of S with the velocity v of S'. Then a boost 
in this direction with speed v transforms S into a frame that is at rest with respect to 
S'. A final rotation lines up the coordinate frame with that of S'. The displacement 
and spatial rotations introduce no new physics, and the only special-relativistic 
consideration concerns the boost. Thus, without loss of generality, we can restrict 
our attention to inertial frames S and S' that are in standard configuration, for 
which the Lorentz transformation is given by (1.3) or (1.5). 


1.5 The interval and the lightcone 

If we consider two events A and B having coordinates $(t'_A, x'_A, y'_A, z'_A)$ and
$(t'_B, x'_B, y'_B, z'_B)$ in S', then, from (1.5), the interval between the events is given by


³ Throughout this book, the notation $\vec{v}$ is used specifically to denote three-dimensional vectors, whereas $\boldsymbol{v}$
denotes a general vector, which is most often a 4-vector.

⁴ If $\vec{a} = 0$ then the Lorentz transformation connecting the two inertial frames is called homogeneous, while if
$\vec{a} \neq 0$ it is called inhomogeneous. Inhomogeneous transformations are often referred to as Poincaré transformations,
in which case homogeneous transformations are referred to simply as Lorentz transformations.



$$\begin{aligned}
\Delta s^2 &= c^2\Delta t'^2 - \Delta x'^2 - \Delta y'^2 - \Delta z'^2 \\
&= [(c\Delta t)\cosh\psi - (\Delta x)\sinh\psi]^2 - [-(c\Delta t)\sinh\psi + (\Delta x)\cosh\psi]^2 - \Delta y^2 - \Delta z^2 \\
&= c^2\Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2.
\end{aligned}$$

Thus the interval is invariant under the boost (1.5) and, from the above discussion, 
we may infer that $\Delta s^2$ is in fact invariant under any Poincaré transformation. This
suggests that the interval is an underlying geometrical property of the spacetime
itself, i.e. an invariant ‘distance’ between events in spacetime. It also follows that
the sign of $\Delta s^2$ is defined invariantly, as follows:


for $\Delta s^2 > 0$, the interval is timelike;

for $\Delta s^2 = 0$, the interval is null or lightlike;

for $\Delta s^2 < 0$, the interval is spacelike.


This embodies the standard lightcone structure shown in Figure 1.3. Events A and 
B are separated by a timelike interval, A and C by a lightlike (or null) interval and



Figure 1.3 Spacetime diagram illustrating the lightcone of an event A (the y- 
and z- axes have been suppressed). Events A and B are separated by a timelike 
interval, A and C by a lightlike (or null) interval and A and D by a spacelike 
interval. 
A and D by a spacelike interval. The geometrical distinction between timelike and
spacelike intervals corresponds to a physical distinction: if the interval is timelike 
then we can find an inertial frame in which the events occur at the same spatial 
coordinates and if the interval is spacelike then we can find an inertial frame 
in which the events occur at the same time coordinate. This becomes obvious 
when we consider the spacetime diagram of a Lorentz transformation; we shall 
do this next. 
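
The classification of intervals by the sign of Δs², and its invariance under a boost, can be illustrated with a short Python sketch. The event coordinates below are hypothetical values chosen to mimic the relationships between A, B, C and D in Figure 1.3; the function names and the units with c = 1 are likewise assumptions.

```python
import math

def interval_squared(a, b, c=1.0):
    """Delta s^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 for events a, b = (t, x, y, z)."""
    dt, dx, dy, dz = (b[i] - a[i] for i in range(4))
    return (c * dt)**2 - dx**2 - dy**2 - dz**2

def classify(ds2, tol=1e-12):
    """Label an interval by the sign of Delta s^2 (tol absorbs rounding error)."""
    if ds2 > tol:
        return "timelike"
    if ds2 < -tol:
        return "spacelike"
    return "null"

def boost(event, v, c=1.0):
    """Standard-configuration Lorentz boost of a single event."""
    t, x, y, z = event
    gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return (gamma * (t - v * x / c**2), gamma * (x - v * t), y, z)

A = (0.0, 0.0, 0.0, 0.0)
B = (2.0, 1.0, 0.0, 0.0)   # inside the lightcone of A
C = (1.0, 1.0, 0.0, 0.0)   # on the lightcone of A
D = (1.0, 2.0, 0.0, 0.0)   # outside the lightcone of A

for name, E in (("B", B), ("C", C), ("D", D)):
    ds2 = interval_squared(A, E)
    ds2_boosted = interval_squared(boost(A, 0.6), boost(E, 0.6))
    print(name, classify(ds2), ds2, ds2_boosted)
```

For each pair the boosted and unboosted values of Δs² agree to rounding error, so the label attached to the interval does not depend on the inertial frame.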


1.6 Spacetime diagrams 

Figure 1.3 is an example of a spacetime diagram. Such diagrams are extremely 
useful in illustrating directly many special-relativistic effects, in particular coordinate
transformations on the Minkowski spacetime between different inertial
frames. The spacetime diagram in Figure 1.4 shows the change of coordinates of
an event A corresponding to the standard-configuration Lorentz transformation
(1.5). The x'-axis is simply the line t' = 0 and the t'-axis is the line x' = 0.
From the Lorentz-boost transformation (1.3) we see that the angle between the 
x- and x'- axes is the same as that between the t- and t'- axes and has the value 





Figure 1.4 Spacetime diagram illustrating the coordinate transformation 
between two inertial frames S and S' in standard configuration (the y- and z- 
axes have been suppressed). The worldlines of the origins of S and S' are the 
axes ct and ct' respectively. 







$\tan^{-1}(v/c)$. Moreover, we note that the t- and t'- axes are also the worldlines of
the origins of S and S' respectively. 

It is important to realise that the coordinates of the event A in the frame S' are
not obtained by extending perpendiculars from A to the x'- and t'- axes. Since
the x'-axis is simply the line t' = 0, it follows that lines of simultaneity in S' are
parallel to the x'-axis. Similarly, lines of constant x' are parallel to the t'-axis. The
same reasoning is equally valid for obtaining the coordinates of A in the frame
S but, since the x- and t- axes are drawn as orthogonal in the diagram, this is
equivalent simply to extending perpendiculars from A to the x- and t- axes in the
more familiar manner.

The concept of simultaneity is simply illustrated using a spacetime diagram. 
For example, in Figure 1.5 we replot the events in Figure 1.3, together with the x'-
and t'- axes corresponding to a Lorentz boost in standard configuration at some
velocity v. We see that the events A and D, which are separated by a spacelike
interval, lie on a line of constant t' and so are simultaneous in S'. Evidently, A
and D are not simultaneous in S; D occurs at a later time than A. In a similar
way, it is straightforward to find a standard-configuration Lorentz boost such that
the events A and B, which are separated by a timelike interval, lie on a line of
constant x' and hence occur at the same spatial location in S'. 
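
The boost speeds needed in these two constructions follow directly from the Lorentz transformation of coordinate differences, Δt' = γ(Δt - vΔx/c²) and Δx' = γ(Δx - vΔt). Setting Δt' = 0 gives v = c²Δt/Δx, which is below c only for a spacelike interval; setting Δx' = 0 gives v = Δx/Δt, which is below c only for a timelike interval. A minimal Python sketch, assuming separations along the x-axis only, units with c = 1, and hypothetical event separations and function names:

```python
import math

def boost_differences(dt, dx, v, c=1.0):
    """Return (dt', dx') for coordinate differences (dt, dx) between two events."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return gamma * (dt - v * dx / c**2), gamma * (dx - v * dt)

def speed_for_simultaneity(dt, dx, c=1.0):
    """Boost speed making two spacelike-separated events simultaneous in S'."""
    if (c * dt)**2 - dx**2 >= 0:
        raise ValueError("interval must be spacelike")
    return c**2 * dt / dx

def speed_for_same_place(dt, dx, c=1.0):
    """Boost speed making two timelike-separated events occur at the same x'."""
    if (c * dt)**2 - dx**2 <= 0:
        raise ValueError("interval must be timelike")
    return dx / dt

# Hypothetical separations from event A: D is spacelike, B is timelike.
dt_D, dx_D = 1.0, 2.0
dt_B, dx_B = 2.0, 0.5

v_sim = speed_for_simultaneity(dt_D, dx_D)
v_loc = speed_for_same_place(dt_B, dx_B)
print(v_sim, boost_differences(dt_D, dx_D, v_sim))  # dt' vanishes
print(v_loc, boost_differences(dt_B, dx_B, v_loc))  # dx' vanishes
```

Asking for the wrong kind of frame (for example, a frame in which two timelike-separated events are simultaneous) raises an error, reflecting the fact that the required speed would not be less than c.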



Figure 1.5 The events illustrated in Figure 1.3 and a Lorentz boost such that A
and D are simultaneous in S'. 
1.7 Length contraction and time dilation

Two elementary (but profound) consequences of the Lorentz transformations 
are length contraction and time dilation. Both these effects are easily derived
from (1.3). 


Length contraction 

Consider a rod of proper length $\ell_0$ at rest in S' (see Figure 1.6); we have

$$\ell_0 = x'_B - x'_A.$$

We want to apply the Lorentz transformation formulae and so find what length 
an observer in frame S assigns to the rod. Applying the second formula in (1.3), 
we obtain 


$$\begin{aligned}
x'_A &= \gamma(x_A - vt_A), \\
x'_B &= \gamma(x_B - vt_B),
\end{aligned}$$


relating the coordinates of the ends of the rod in S' to the coordinates in S. The 
observer in S measures the length of the rod at a fixed time $t = t_A = t_B$ as

$$\ell = x_B - x_A = \frac{1}{\gamma}(x'_B - x'_A) = \frac{\ell_0}{\gamma}.$$

Hence in S the rod appears contracted to the length

$$\ell = \ell_0\left(1 - v^2/c^2\right)^{1/2}.$$
If a rod is moving relative to S in a direction perpendicular to its length, 
however, it is straightforward to show that it suffers no contraction. It thus follows 
that the volume V of a moving object, as measured by simultaneously noting the 
positions of the boundary points in S, is related to its proper volume $V_0$ by
$V = V_0(1 - v^2/c^2)^{1/2}$. This fact must be taken into account when considering densities.
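
The length-contraction argument can be followed step by step in a short Python sketch. It places the ends of the rod at x' = 0 and x' = ℓ₀ in S', inverts x' = γ(x - vt) at a common time t in S, and subtracts. The function names and sample values are illustrative assumptions, with units in which c = 1.

```python
import math

def gamma_factor(v, c=1.0):
    """Lorentz factor gamma = (1 - v^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (v / c)**2)

def measured_length(l0, v, t=0.0, c=1.0):
    """Length assigned in S to a rod of proper length l0 at rest in S'.

    The ends sit at x'_A = 0 and x'_B = l0. Inverting x' = gamma (x - v t)
    at a common time t in S gives x = x'/gamma + v t for each end.
    """
    g = gamma_factor(v, c)
    x_a = 0.0 / g + v * t
    x_b = l0 / g + v * t
    return x_b - x_a

def measured_volume(v0, v, c=1.0):
    """Volume in S of an object with proper volume v0; only one dimension contracts."""
    return v0 * math.sqrt(1.0 - (v / c)**2)

print(measured_length(1.0, 0.6))         # contracted length of a unit rod
print(measured_length(1.0, 0.6, t=5.0))  # same result at a later common time
print(measured_volume(2.0, 0.6))
```

The result is independent of the common time t chosen in S, which is why the measurement must be made simultaneously in S but may be made at any such instant.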



Figure 1.6 Two inertial frames S and S' in standard configuration. A rod of 
proper length $\ell_0$ is at rest in S'.





Figure 1.7 Two inertial frames S and S' in standard configuration. A clock is
at rest in S'. 


Time dilation 

Suppose we have a clock at rest in S', in which two successive ‘clicks’ of the 
clock (events A and B) are separated by a time interval $T_0$ (see Figure 1.7). The
times of the clicks as recorded in S are

$$\begin{aligned}
t_A &= \gamma(t'_A + vx'_A/c^2), \\
t_B &= \gamma(t'_A + T_0 + vx'_B/c^2).
\end{aligned}$$

Since the clock is at rest in S' we have $x'_A = x'_B$, and so on subtracting we obtain

$$T = t_B - t_A = \gamma T_0 = \frac{T_0}{(1 - v^2/c^2)^{1/2}}.$$

Hence, the moving clock ticks more slowly by a factor of $(1 - v^2/c^2)^{1/2}$ (time
dilation). 

Note that an ideal clock is one that is unaffected by acceleration: external
forces act identically on all parts of the clock (an example is a muon).
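
To give a sense of scale, the time-dilation formula T = γT₀ can be applied to a fast muon. In the sketch below the muon's mean proper lifetime is taken as roughly 2.2 microseconds and its speed as 0.995c; both numbers, and the function name, are illustrative assumptions rather than values quoted in the text.

```python
import math

C_LIGHT = 299_792_458.0   # speed of light in m/s
MUON_LIFETIME = 2.2e-6    # approximate mean proper lifetime in s (illustrative)

def dilated_time(t0, v, c=C_LIGHT):
    """Time T = gamma * t0 recorded in S for a proper interval t0 of a moving clock."""
    return t0 / math.sqrt(1.0 - (v / c)**2)

v = 0.995 * C_LIGHT
t_lab = dilated_time(MUON_LIFETIME, v)

print(t_lab / MUON_LIFETIME)   # the Lorentz factor gamma
print(v * MUON_LIFETIME)       # distance covered in one lifetime, ignoring dilation (m)
print(v * t_lab)               # distance covered in one dilated lifetime (m)
```

At this speed the Lorentz factor is about ten, so the distance travelled in one mean lifetime, as measured in S, is about ten times the naive estimate.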


1.8 Invariant hyperbolae 

Length contraction and time dilation are easily illustrated using spacetime 
diagrams. However, while Figure 1.4 illustrates the positions of the x'- and t'- axes
corresponding to a standard Lorentz boost, we have not yet calibrated the length
scales along them. To perform this calibration, we make use of the fact that the
interval $\Delta s^2$ between two events is an invariant, and draw the invariant hyperbolae

$$c^2t^2 - x^2 = c^2t'^2 - x'^2 = \pm 1$$

on the spacetime diagram, as shown in Figure 1.8. Then, if we first take the
negative sign, setting ct = 0, we obtain x = ±1. It follows that OA is a unit



Figure 1.8 The invariant hyperbolae $c^2t^2 - x^2 = c^2t'^2 - x'^2 = \pm 1$.


distance along the x-axis. Now setting ct' = 0 we find that x' = ±1, so that OC is
a unit distance along the x'-axis. Similarly, OB and OD are unit distances along
the t- and t'- axes respectively. We also note that the tangents to the invariant
hyperbolae at C and D are lines of constant x' and t' respectively.

The length contraction and time dilation effects can now be read off directly 
from the diagram. For example, the worldlines of the end-points of a unit rod 
OC in S', namely x' = 0 and x' = 1, cut the x-axis in less than unit distance.
Similarly, worldlines x = 0 and x = 1 in S cut the x'-axis inside OC, illustrating
the reciprocal nature of length contraction. Also, a clock at rest at the origin of
S' will move along the t'-axis, reaching D with a reading of t' = 1. However, the
event D has a t-coordinate that is greater than unity, thereby illustrating the time
dilation effect. 
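
The calibration argument can be checked numerically by locating the unit points C and D in the coordinates of S. The sketch below assumes units with c = 1 and a hypothetical boost speed β = 0.6; the helper name `to_S` is likewise an assumption. It uses the inverse boost ct = γ(ct' + βx'), x = γ(x' + βct').

```python
import math

def to_S(ct_p, x_p, beta):
    """Coordinates (ct, x) in S of a point with coordinates (ct', x') in S'."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (ct_p + beta * x_p), gamma * (x_p + beta * ct_p)

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)

ct_C, x_C = to_S(0.0, 1.0, beta)   # C: unit distance along the x'-axis
ct_D, x_D = to_S(1.0, 0.0, beta)   # D: unit distance along the ct'-axis

# C lies on the hyperbola c^2 t^2 - x^2 = -1 and D on c^2 t^2 - x^2 = +1.
hyperbola_C = ct_C**2 - x_C**2
hyperbola_D = ct_D**2 - x_D**2
print(hyperbola_C, hyperbola_D)

# The worldline x' = 1 of the far end of the unit rod crosses the x-axis (t = 0)
# at x = 1/gamma, which is less than unity: length contraction.
rod_end_on_x_axis = 1.0 / gamma
print(rod_end_on_x_axis)

# The event D has ct = gamma, which is greater than unity: time dilation.
print(ct_D)
```

Reading the diagram this way makes clear that the unit marks on the primed axes lie further from the origin on the page than those on the unprimed axes, even though they represent the same invariant interval.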


1.9 The Minkowski spacetime line element 

Let us consider more closely the meaning of the interval between two events A and
B in spacetime. Given that in a particular inertial frame S the coordinates of A
and B are $(t_A, x_A, y_A, z_A)$ and $(t_B, x_B, y_B, z_B)$, we have so far taken the square of
the interval between A and B to be

$$\Delta s^2 = c^2\Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2,$$




Figure 1.9 Two paths in spacetime connecting the events A and B. 

where $\Delta t = t_B - t_A$ etc. This interval is invariant under Lorentz transformation
and corresponds to the ‘distance’ in spacetime measured along the straight line 
in Figure 1.9 connecting A and B. This line may be interpreted as the worldline 
of a particle moving at constant velocity relative to S between events A and B. 
However, the question naturally arises of what interval is measured between A 
and B along some other path in spacetime, for example the ‘wiggly’ path shown 
in Figure 1.9. 

To address this question, we must express the intrinsic geometry of the 
Minkowski spacetime in infinitesimal form. Clearly, if two infinitesimally separated
events have coordinates (t, x, y, z) and (t + dt, x + dx, y + dy, z + dz) in S
then the square of the infinitesimal interval between them is given by⁵

$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2,$$

which is known as the line element of Minkowski spacetime, or the special- 
relativistic line element. From our earlier considerations, it is clear that ds 2 is 
invariant under any Lorentz transformation. The invariant interval between A and 
B along an arbitrary path in spacetime is then given by 

\[ \Delta s = \int_A^B ds, \]


where the integral is evaluated along the particular path under consideration. 
Clearly, to perform this integral we must have a set of equations describing the 
spacetime path. 

⁵ To avoid mathematical ambiguity, one should properly denote the squares of infinitesimal coordinate intervals 
by $(dt)^2$ etc., but this notation is not in common use in relativity textbooks. We will thus adopt the more 
usual form $dt^2$, but it should be remembered that this is not the differential of $t^2$. 
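
To make the path dependence concrete, here is a minimal numerical sketch that sums ds over small straight segments of a sampled path in the (t, x)-plane. The function name, the sample paths and the choice c = 1 are assumptions made for this illustration only; the ‘wiggly’ path is an arbitrary sinusoid that stays timelike everywhere.

```python
import math

def interval_along_path(t, x, c=1.0):
    """Approximate the integral of ds along a sampled path with dy = dz = 0.

    Each segment contributes sqrt(c^2 dt^2 - dx^2); the path must be timelike.
    """
    total = 0.0
    for i in range(len(t) - 1):
        ds2 = (c * (t[i + 1] - t[i]))**2 - (x[i + 1] - x[i])**2
        if ds2 < 0:
            raise ValueError("segment is spacelike, so ds is not real")
        total += math.sqrt(ds2)
    return total

n = 2000
ts = [i / n for i in range(n + 1)]                       # A at t = 0, B at t = 1
straight = [0.0 for _ in ts]                             # straight line from A to B
wiggly = [0.1 * math.sin(2 * math.pi * t) for t in ts]   # same end-points, curved path

s_straight = interval_along_path(ts, straight)
s_wiggly = interval_along_path(ts, wiggly)
print(s_straight, s_wiggly)
```

In this example the interval along the wiggly path comes out smaller than along the straight one, because every spatial excursion subtracts from $c^2 dt^2$.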


1.10 Particle worldlines and proper time 

Let us now turn to the description of the motion of a particle in spacetime terms. 
A particle describes a worldline in spacetime. In general, for two infinitesimally 
separated events in spacetime, by analogy with our earlier discussion, we have: 

for $ds^2 > 0$, the interval is timelike; 

for $ds^2 = 0$, the interval is null or lightlike; 

for $ds^2 < 0$, the interval is spacelike. 

However, relativistic mechanics prohibits the acceleration of a massive particle 
to speeds greater than or equal to c, which implies that its worldline must lie 
within the lightcone (Figure 1.3) at each event on it. In other words, the interval 
between any two infinitesimally separated events on the particle’s worldline must 
be timelike (and future-pointing). For a massless particle such as a photon, any 
two events on its worldline are separated by a null interval. Figure 1.10 illustrates 
general worldlines for a massive particle and for a photon. 


Figure 1.10 The worldlines of a photon (solid line) and a massive particle 
(broken line). The lightcones at seven events are shown. 



A particle worldline may be described by giving x, y and z as functions of t 
in some inertial frame S. However, a more four-dimensional way of describing a 
worldline is to give the four coordinates (t, x, y, z) of the particle in S as functions 
of a parameter $\lambda$ that varies monotonically along the worldline. Given the four 
functions $t(\lambda)$, $x(\lambda)$, $y(\lambda)$ and $z(\lambda)$, each value of $\lambda$ determines a point along 
the curve. Any such parameter is possible, but a natural one to use for a massive 
particle is its proper time. 

We define the proper time interval $d\tau$ between two infinitesimally separated 
events on the particle’s worldline by 

\[ c^2\,d\tau^2 = ds^2. \tag{1.6} \]

Thus, if the coordinate differences in S between the two events are dt, dx, dy, dz 
then we have 

\[ c^2\,d\tau^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2. \]

Hence the proper time interval between the events is given by 

\[ d\tau = (1 - v^2/c^2)^{1/2}\,dt = dt/\gamma_v, \]

where v is the speed of the particle with respect to S over this infinitesimal 
interval. If we integrate $d\tau$ between two points A and B on the worldline, we 
obtain the total elapsed proper time interval: 

\[ \Delta\tau = \int_A^B d\tau = \int_A^B \left[1 - \frac{v^2(t)}{c^2}\right]^{1/2} dt. \tag{1.7} \]

We see that if the particle is at rest in S then the proper time $\tau$ is just the 
coordinate time t measured by clocks at rest in S. If at any instant in the history 
of the particle we introduce an instantaneous rest frame S' such that the particle 
is momentarily at rest in S' then we see that the proper time $\tau$ is simply the 
time recorded by a clock that moves along with the particle. It is therefore an 
invariantly defined quantity, a fact that is clear from (1.6). 
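
The elapsed proper time is easy to estimate numerically once the speed v(t) of the particle in S is known. The sketch below uses a simple midpoint rule to integrate $(1 - v^2/c^2)^{1/2}\,dt$; the function name, the step count and the choice c = 1 are assumptions for illustration.

```python
import math

def proper_time(v_of_t, t_start, t_end, c=1.0, steps=10_000):
    """Midpoint-rule estimate of the integral of sqrt(1 - v(t)^2 / c^2) dt."""
    dt = (t_end - t_start) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t_start + (i + 0.5) * dt
        v = v_of_t(t_mid)
        total += math.sqrt(1.0 - (v / c)**2) * dt
    return total

# A particle at rest in S: proper time equals coordinate time.
tau_rest = proper_time(lambda t: 0.0, 0.0, 10.0)

# A particle moving at constant speed 0.6c: proper time is reduced by 1/gamma = 0.8.
tau_moving = proper_time(lambda t: 0.6, 0.0, 10.0)

print(tau_rest, tau_moving)
```

Any other speed profile with |v| < c can be passed in the same way, for example `lambda t: 0.5 * math.sin(t)`.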

Thus the worldline of a massive particle can be described by giving the four 
coordinates (t, x, y, z) as functions of $\tau$ (see Figure 1.11). For example, 

\[ t = \tau(1 - v^2/c^2)^{-1/2}, \qquad x = v\tau(1 - v^2/c^2)^{-1/2}, \qquad y = z = 0 \]

is the worldline of a particle, moving at constant speed v along the x-axis of S, 
which passes through the origin of S at t = 0. 

Figure 1.11 A path in the (t, x)-plane can be specified by giving one coordinate 
in terms of the other, for example x = x(t), or alternatively by giving both 
coordinates as functions of a parameter $\lambda$ along the curve: $t = t(\lambda)$, $x = x(\lambda)$. 
For massive particles the natural parameter to use is the proper time $\tau$. 
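
As a quick consistency check, which is an addition for illustration and uses only the definition (1.6), one can differentiate the constant-speed worldline $t = \gamma_v\tau$, $x = v\gamma_v\tau$, $y = z = 0$ with respect to $\tau$ and confirm that $\tau$ really is the proper time:

```latex
\begin{align*}
\frac{dt}{d\tau} &= \gamma_v, \qquad
\frac{dx}{d\tau} = v\gamma_v, \qquad
\gamma_v = (1 - v^2/c^2)^{-1/2}, \\
c^2\left(\frac{dt}{d\tau}\right)^2 - \left(\frac{dx}{d\tau}\right)^2
  &= \gamma_v^2\,(c^2 - v^2) = c^2 .
\end{align*}
```

Multiplying through by $d\tau^2$ gives $c^2\,dt^2 - dx^2 = c^2\,d\tau^2$, in agreement with (1.6), and $x/t = v$ confirms that the particle moves at speed v through the origin.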


1.11 The Doppler effect 

A useful illustration of particle worldlines and the concept of proper time is 
provided by deriving the Doppler effect in a transparently four-dimensional 
manner. Let us consider an observer $\mathcal{O}$ at rest in some inertial frame S, and a 
radiation-emitting source $\mathcal{E}$ moving along the positive x-axis of S at a uniform 
speed v. Suppose that the source emits the first wavecrest of a photon at an 
event A, with coordinates $(t_e, x_e)$ in S, and the next wavecrest at an event B 
with coordinates $(t_e + \Delta t_e, x_e + \Delta x_e)$. Let us assume that these two wavecrests 
reach the observer at the events C and D, with coordinates $(t_o, x_o)$ and $(t_o + \Delta t_o, x_o)$ 
respectively. This situation is illustrated in Figure 1.12. From (1.7), the proper 
time interval experienced by $\mathcal{E}$ between the events A and B is 

\[ \Delta\tau_{AB} = (1 - v^2/c^2)^{1/2}\,\Delta t_e, \tag{1.8} \]

and the proper time interval experienced by $\mathcal{O}$ between the events C and D is 

\[ \Delta\tau_{CD} = \Delta t_o. \tag{1.9} \]

Figure 1.12 Spacetime diagram of the Doppler effect. 


Along each of the worldlines representing the photon wavecrests, 

\[ ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2 = 0. \]

Thus, since we are assuming that dy = dz = 0, along the worldline connecting 
the events A and C we have 

\[ \int_{t_e}^{t_o} c\,dt = -\int_{x_e}^{x_o} dx, \tag{1.10} \]

where the minus sign on the right-hand side arises because the photon is travelling 
in the negative x-direction. From (1.10), we obtain the (obvious) result 
$c(t_o - t_e) = -(x_o - x_e)$. Similarly, along the worldline connecting B and D we have 

\[ \int_{t_e + \Delta t_e}^{t_o + \Delta t_o} c\,dt = -\int_{x_e + \Delta x_e}^{x_o} dx. \]

Rewriting the integrals on each side, we obtain 

\[ \left(\int_{t_e}^{t_o} + \int_{t_o}^{t_o + \Delta t_o} - \int_{t_e}^{t_e + \Delta t_e}\right) c\,dt = -\left(\int_{x_e}^{x_o} - \int_{x_e}^{x_e + \Delta x_e}\right) dx, \]

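where the first integrals on each side of the equation cancel by virtue of (1.10). 
Thus we find that $c\,\Delta t_o - c\,\Delta t_e = \Delta x_e$, from which we obtain, noting that 
$\Delta x_e/\Delta t_e = v$ and writing $\beta = v/c$, 

\[ \Delta t_o = \left(1 + \frac{1}{c}\frac{\Delta x_e}{\Delta t_e}\right)\Delta t_e = (1 + \beta)\,\Delta t_e. \tag{1.11} \]

Hence, using (1.8), (1.9) and (1.11), we can derive the ratio of the proper time 
intervals $\Delta\tau_{CD}$ and $\Delta\tau_{AB}$ experienced by $\mathcal{O}$ and $\mathcal{E}$ respectively: 

\[ \frac{\Delta\tau_{CD}}{\Delta\tau_{AB}} = \frac{(1+\beta)\,\Delta t_e}{(1-\beta^2)^{1/2}\,\Delta t_e} = \frac{1+\beta}{(1-\beta)^{1/2}(1+\beta)^{1/2}} = \frac{(1+\beta)^{1/2}}{(1-\beta)^{1/2}}. \]

This ratio must be the reciprocal of the ratio $\nu_o/\nu_e$ of the photon’s frequencies as measured 
by $\mathcal{O}$ and $\mathcal{E}$ respectively, and thus we obtain the familiar Doppler-effect formula 

\[ \frac{\nu_o}{\nu_e} = \left(\frac{1-\beta}{1+\beta}\right)^{1/2}. \tag{1.12} \]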
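
The chain of steps leading to the Doppler formula can be sketched numerically. The code below computes the frequency ratio in two ways: directly from the closed form $\nu_o/\nu_e = [(1-\beta)/(1+\beta)]^{1/2}$ for a receding source, and step by step from the proper-time intervals of (1.8), (1.9) and (1.11). The function names are hypothetical and chosen only for this illustration.

```python
import math

def doppler_ratio(beta):
    """nu_observed / nu_emitted for a source receding at speed beta * c."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))

def doppler_ratio_from_intervals(beta, dt_e=1.0):
    """The same ratio assembled from the proper-time intervals in the derivation."""
    dtau_source = math.sqrt(1.0 - beta**2) * dt_e   # (1.8): interval on the source's clock
    dtau_observer = (1.0 + beta) * dt_e             # (1.9) combined with (1.11)
    # Frequencies are inversely proportional to the intervals between wavecrests.
    return dtau_source / dtau_observer

for beta in (0.0, 0.1, 0.6):
    print(beta, doppler_ratio(beta), doppler_ratio_from_intervals(beta))
```

The two functions agree to rounding error, and the ratio falls below unity as β increases, corresponding to a redshift for a receding source.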


1.12 Addition of velocities in special relativity 

If a particle’s worldline is described by giving x, y and z as functions of t in 
some inertial frame S then the components of its velocity in S at any point are 

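\[ u_x = \frac{dx}{dt}, \qquad u_y = \frac{dy}{dt}, \qquad u_z = \frac{dz}{dt}. \]

The components of its velocity in some other inertial frame S' are usually obtained 
by taking differentials of the Lorentz transformation. For inertial frames S and S' 
related by a boost v in standard configuration, we have from (1.3) 

\[ dt' = \gamma_v(dt - v\,dx/c^2), \qquad dx' = \gamma_v(dx - v\,dt), \qquad dy' = dy, \qquad dz' = dz, \]

where we have made explicit the dependence of $\gamma$ on v. We immediately obtain 

\[ u'_x = \frac{u_x - v}{1 - u_x v/c^2}, \qquad u'_y = \frac{u_y}{\gamma_v(1 - u_x v/c^2)}, \qquad u'_z = \frac{u_z}{\gamma_v(1 - u_x v/c^2)}. \tag{1.13} \]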


These replace the ‘common sense’ addition-of-velocities formulae of Newtonian 
mechanics. The inverse transformations are obtained by replacing v by —v. 
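
A short sketch of the velocity transformation follows, obtained by dividing the differentials dx', dy', dz' by dt' as described above. The function name and the choice c = 1 are assumptions for illustration. The example transforms a photon moving along the y-axis of S and then applies the inverse transformation by replacing v with −v.

```python
import math

def transform_velocity(u, v, c=1.0):
    """Components (ux, uy, uz) in S mapped to S', which moves at speed v along x."""
    ux, uy, uz = u
    gamma_v = 1.0 / math.sqrt(1.0 - (v / c)**2)
    denom = 1.0 - ux * v / c**2          # common factor coming from dt'/dt
    return ((ux - v) / denom,
            uy / (gamma_v * denom),
            uz / (gamma_v * denom))

photon_in_S = (0.0, 1.0, 0.0)                               # light travelling along y in S
photon_in_S_prime = transform_velocity(photon_in_S, 0.6)    # same photon seen from S'
round_trip = transform_velocity(photon_in_S_prime, -0.6)    # inverse: replace v by -v

speed_in_S_prime = math.sqrt(sum(comp**2 for comp in photon_in_S_prime))
print(photon_in_S_prime, speed_in_S_prime, round_trip)
```

The direction of the photon changes between the frames, but its speed remains c, and the round trip returns the original components.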

The special-relativistic addition of velocities along the same direction is 
elegantly expressed using the rapidity parameter (Section 1.4). For example, 
consider three inertial frames S, S' and S''. Suppose that S' is related to S by a 
boost of speed v in the x-direction and that S'' is related to S' by a boost of speed 
u' in the x'-direction. Using (1.5), we quickly find that 

\[ ct'' = ct\cosh(\psi_v + \psi_{u'}) - x\sinh(\psi_v + \psi_{u'}), \]
\[ x'' = -ct\sinh(\psi_v + \psi_{u'}) + x\cosh(\psi_v + \psi_{u'}), \]

where $\tanh\psi_v = v/c$ and $\tanh\psi_{u'} = u'/c$. This shows that S'' is connected to S 
by a boost in the x-direction with speed u, where $u/c = \tanh(\psi_v + \psi_{u'})$. Thus we 
simply add the rapidities (in a similar way to adding the angles of two spatial 
rotations about the same axis). This gives 

\[ u = c\tanh(\psi_v + \psi_{u'}) = c\,\frac{\tanh\psi_v + \tanh\psi_{u'}}{1 + \tanh\psi_v\tanh\psi_{u'}} = \frac{u' + v}{1 + u'v/c^2}, \]

which is the special-relativistic formula for the addition of velocities in the same 
direction. 
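
The equivalence between adding rapidities and the closed-form addition law $u = (u' + v)/(1 + u'v/c^2)$ can be checked with a few lines of code. This is a minimal sketch with hypothetical function names and c = 1 by default.

```python
import math

def add_velocities(u_prime, v, c=1.0):
    """Collinear relativistic addition: speed in S of an object moving at u' in S'."""
    return (u_prime + v) / (1.0 + u_prime * v / c**2)

def add_via_rapidity(u_prime, v, c=1.0):
    """The same result obtained by adding the rapidities psi = artanh(speed / c)."""
    psi_total = math.atanh(v / c) + math.atanh(u_prime / c)
    return c * math.tanh(psi_total)

print(add_velocities(0.5, 0.5), add_via_rapidity(0.5, 0.5))
print(add_velocities(0.9, 0.9), add_via_rapidity(0.9, 0.9))
```

However close to c the two speeds are, their composition stays below c, because the hyperbolic tangent of a finite rapidity is always less than one.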


1.13 Acceleration in special relativity 

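The components of the acceleration of a particle in S are defined as 

\[ a_x = \frac{du_x}{dt}, \qquad a_y = \frac{du_y}{dt}, \qquad a_z = \frac{du_z}{dt}, \]

and the corresponding quantities in S' are obtained from the differential forms of 
the expressions (1.13). For example, 

\[ du'_x = \frac{du_x}{\gamma_v^2(1 - u_x v/c^2)^2}. \]

Also, from the Lorentz transformation (1.3) we find that 

\[ dt' = \gamma_v(dt - v\,dx/c^2) = \gamma_v(1 - u_x v/c^2)\,dt. \]

So, for example, we have 

\[ a'_x = \frac{du'_x}{dt'} = \frac{a_x}{\gamma_v^3(1 - u_x v/c^2)^3}. \tag{1.14} \]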

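Similarly, we obtain 

\[ a'_y = \frac{a_y}{\gamma_v^2(1 - u_x v/c^2)^2} + \frac{u_y v}{c^2}\,\frac{a_x}{\gamma_v^2(1 - u_x v/c^2)^3}, \]
\[ a'_z = \frac{a_z}{\gamma_v^2(1 - u_x v/c^2)^2} + \frac{u_z v}{c^2}\,\frac{a_x}{\gamma_v^2(1 - u_x v/c^2)^3}. \]
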


We see from these transformation formulae that acceleration is not invariant 
in special relativity, unlike in Newtonian mechanics, as discussed in Section 1.2. 
However, it is clear that acceleration is an absolute quantity, that is, all observers 
agree upon whether a body is accelerating. If the acceleration is zero in one 
inertial frame, it is necessarily zero in any other frame. 

Let us investigate the worldline of an accelerated particle. To make our illustration 
concrete, we consider a spaceship moving at a variable speed u(t) relative 
to some inertial frame S and suppose that an observer B in the spaceship makes 
a continuous record of his accelerometer reading $f(\tau)$ as a function of his own 
proper time $\tau$. 

We begin by introducing an instantaneous rest frame (IRF) S', which, at each 
instant, is an inertial frame moving at the same speed v as the spaceship, i.e. v = u. 
Thus, at any instant, the velocity of the spaceship in the IRF S' is zero, i.e. u' = 0. 
Moreover, from the above discussion of proper time, it should be clear that at any 
instant an interval of proper time is equal to an interval of coordinate time in the 
IRF, i.e. $\delta\tau = \delta t'$. An accelerometer measures the rate of change of velocity, so 
that, during a small interval of proper time $\delta\tau$, B will record that his velocity has 
changed by an amount $f(\tau)\,\delta\tau$. Therefore, at any instant, in the IRF S' we have 

\[ \frac{du'}{dt'} = \frac{du'}{d\tau} = f(\tau). \]

From (1.14), we thus obtain 

\[ \frac{du}{dt} = \left(1 - \frac{u^2}{c^2}\right)^{3/2} f(\tau). \]

However, since $d\tau = (1 - u^2/c^2)^{1/2}\,dt$, we find that 

\[ \frac{du}{d\tau} = \left(1 - \frac{u^2}{c^2}\right) f(\tau), \]

which integrates easily to give 

\[ u(\tau) = c\tanh\psi(\tau), \]

where $c\psi(\tau) = \int_0^\tau f(\tau')\,d\tau'$ and we have taken $u(\tau = 0)$ to be zero. Thus we have 
an expression for the velocity of the spaceship in S as a function of B’s proper time. 
To parameterise the worldline of the spaceship in S, we note that 

\[ \frac{dt}{d\tau} = \left(1 - \frac{u^2}{c^2}\right)^{-1/2} = \cosh\psi(\tau), \qquad \frac{dx}{d\tau} = u\left(1 - \frac{u^2}{c^2}\right)^{-1/2} = c\sinh\psi(\tau). \tag{1.15} \]

Integration of these equations with respect to $\tau$ gives the functions $t(\tau)$ and $x(\tau)$. 
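
When the accelerometer record is not a simple function, the integration can be done numerically. The sketch below steps the rapidity $\psi$ forward using $c\,d\psi/d\tau = f(\tau)$ and accumulates $dt/d\tau = \cosh\psi$ and $dx/d\tau = c\sinh\psi$ with a midpoint rule, starting from rest at the origin. The function name, the step count and the choice c = 1 are assumptions for illustration.

```python
import math

def worldline_from_accelerometer(f, tau_end, c=1.0, steps=20_000):
    """Integrate the worldline of a ship that starts from rest at t = x = 0.

    f is the accelerometer reading as a function of proper time tau.
    Returns (t, x, u) in the inertial frame S at proper time tau_end.
    """
    dtau = tau_end / steps
    psi = t = x = 0.0
    for i in range(steps):
        tau_mid = (i + 0.5) * dtau
        dpsi = f(tau_mid) * dtau / c      # increment of rapidity over this step
        psi_mid = psi + 0.5 * dpsi        # rapidity at the middle of the step
        t += math.cosh(psi_mid) * dtau
        x += c * math.sinh(psi_mid) * dtau
        psi += dpsi
    return t, x, c * math.tanh(psi)

# Constant accelerometer reading f = 1 (in units with c = 1) for two units of proper time.
t_end, x_end, u_end = worldline_from_accelerometer(lambda tau: 1.0, 2.0)
print(t_end, x_end, u_end)
```

For a constant reading the integrals can also be done by hand, giving $t = (c/f)\sinh(f\tau/c)$ and $x = (c^2/f)[\cosh(f\tau/c) - 1]$, which provides a check on the numerical result.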



1.14 Event horizons in special relativity 

The presence of acceleration can produce surprising effects. Consider for simplicity 
the case of uniform acceleration. By this we do not mean that 
du/dt = constant, since this is inappropriate in special relativity because it would 
imply that $u \to \infty$ as $t \to \infty$, which is not permitted. Instead, uniform acceleration 
in special relativity means that the accelerometer reading $f(\tau)$ is constant. 
A spaceship whose engine is set at a constant emission rate would be uniformly 
accelerated in this sense. 

Thus, if f = constant, we have $\psi = f\tau/c$. The equations (1.15) are then easily 
integrated to give 

\[ t = t_0 + \frac{c}{f}\sinh\frac{f\tau}{c}, \qquad x = x_0 + \frac{c^2}{f}\left(\cosh\frac{f\tau}{c} - 1\right), \]

where $t_0$ and $x_0$ are constants of integration. Setting $t_0 = x_0 = 0$ gives the path 
shown in Figure 1.13. The worldline takes the form of a hyperbola. 

Imagine that an observer B has the resources to maintain an acceleration / 
indefinitely. Then there will be events that B will never be able to observe. 
The events in question lie on the future side of the asymptote to B’s hyperbola; 
this asymptote (which is a null line) is the event horizon of B. Objects whose 
worldlines cross this horizon will disappear from B’s view and will seem to take 
for ever to do so. Nevertheless, the objects themselves cross the horizon in a finite 
proper time and still have an infinite lifetime ahead of them. 

Figure 1.13 The worldline of a uniformly accelerated particle B starting from 
rest from the origin of S. If an observer A remains at x = 0, then the worldline 
of A is simply the t-axis. No message sent by A after t = c/f will ever reach B. 
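
The horizon can be made quantitative with a short calculation, sketched here under the assumption that B follows the hyperbola $t = (c/f)\sinh(f\tau/c)$, $x = (c^2/f)[\cosh(f\tau/c) - 1]$ and that a light signal is sent in the positive x-direction from x = 0 at time $t_e$. Setting $ct - ct_e = x$ along the signal and substituting B's worldline gives $e^{-f\tau/c} = 1 - f t_e/c$, which has a solution only for $t_e < c/f$. The function name is hypothetical.

```python
import math

def reception_proper_time(t_emit, f=1.0, c=1.0):
    """Reading of B's clock when a light signal sent from x = 0 at t_emit arrives.

    Returns None if t_emit >= c / f, in which case the signal never reaches B.
    """
    arg = 1.0 - f * t_emit / c
    if arg <= 0.0:
        return None
    return -(c / f) * math.log(arg)

for t_emit in (0.0, 0.5, 0.9, 0.99, 1.0, 2.0):
    print(t_emit, reception_proper_time(t_emit))
```

As the emission time approaches c/f the reception time on B's clock grows without bound, which is the quantitative content of the statement that objects crossing the horizon seem to take for ever to disappear.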


Appendix 1A: Einstein’s route to special relativity 

Most books on special relativity begin with some sort of description of the 
Michelson-Morley experiment and then introduce the Lorentz transformation. In 
fact, Einstein claimed that he was not influenced by this experiment. This is 
disputed by various historians of science and biographers of Einstein. One might 
think that these scholars are on strong ground, especially given that the experiment 
is referred to (albeit obliquely) in Einstein’s papers. However, it may be worth 
taking Einstein’s claim at face value. 

Remember that Einstein was a theorist - one of the greatest theorists who has 
ever lived - and he had a theorist’s way of looking at physics. A good theorist 
develops an intuition about how Nature works, which helps in the formulation 
of physical laws. For example, possible symmetries and conserved quantities are 
considered. We can get a strong clue about Einstein’s thinking from the title of 
his famous 1905 paper on special relativity. The first paragraph is reproduced 
below. 


On the Electrodynamics of Moving Bodies 
by A. Einstein 

It is known that Maxwell’s electrodynamics - as usually understood at the present time - 
when applied to moving bodies, leads to asymmetries which do not appear to be inherent 
in the phenomena. Take, for example, the reciprocal electrodynamic action of a magnet 
and a conductor. The observable phenomenon here depends only on the relative motion 
of the conductor and the magnet, whereas the customary view draws a sharp distinction 
between the two cases in which either the one or the other of these bodies is in motion. 
For if the magnet is in motion and the conductor at rest, there arises in the neighbourhood 
of the magnet an electric field with a certain definite energy, producing a current at the 
places where parts of the conductor are situated. But if the magnet is stationary and 
the conductor in motion, no electric field arises in the neighbourhood of the magnet. In 
the conductor, however, we find an electromotive force, to which in itself there is no 
corresponding energy, but which gives rise - assuming equality of relative motion in the 
two cases discussed - to electric currents of the same path and intensity as those produced 
by the electric forces in the former case. 

You see that Einstein’s paper is not called ‘Transformations between inertial 
frames’, or ‘A theory in which the speed of light is assumed to be a universal 
constant’. Electrodynamics is at the heart of Einstein’s thinking; Einstein realized 
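that Maxwell’s equations of electromagnetism required special relativity. 

Maxwell’s equations are 

\[ \nabla\cdot\vec{D} = \rho, \qquad \nabla\cdot\vec{B} = 0, \]
\[ \nabla\times\vec{E} = -\frac{\partial\vec{B}}{\partial t}, \qquad \nabla\times\vec{H} = \vec{j} + \frac{\partial\vec{D}}{\partial t}, \]

where $\vec{D} = \epsilon_0\vec{E} + \vec{P}$ and $\vec{B} = \mu_0(\vec{H} + \vec{M})$, $\vec{P}$ and $\vec{M}$ being respectively the polarisation 
and the magnetisation of the medium in which the fields are present. In free 
space we can set $\vec{j} = 0$ and $\rho = 0$, and we then get the more obviously symmetrical 
equations 

\[ \nabla\cdot\vec{E} = 0, \qquad \nabla\cdot\vec{B} = 0, \]
\[ \nabla\times\vec{E} = -\frac{\partial\vec{B}}{\partial t}, \qquad \nabla\times\vec{B} = \mu_0\epsilon_0\frac{\partial\vec{E}}{\partial t}. \]

Taking the curl of the equation for $\nabla\times\vec{E}$, applying the relation 

\[ \nabla\times(\nabla\times\vec{E}) = \nabla(\nabla\cdot\vec{E}) - \nabla^2\vec{E} \]

and performing a similar operation for $\vec{B}$ in the equation for $\nabla\times\vec{B}$, we derive the 
equations for electromagnetic waves: 

\[ \nabla^2\vec{E} = \mu_0\epsilon_0\frac{\partial^2\vec{E}}{\partial t^2}, \qquad \nabla^2\vec{B} = \mu_0\epsilon_0\frac{\partial^2\vec{B}}{\partial t^2}. \]

These both have the form of a wave equation with a propagation speed 
$c = 1/\sqrt{\mu_0\epsilon_0}$. Now, the constants $\mu_0$ and $\epsilon_0$ are properties of the ‘vacuum’: 

$\mu_0$, the permeability of a vacuum, equals $4\pi\times10^{-7}$ H m$^{-1}$, 
$\epsilon_0$, the permittivity of a vacuum, equals $8.85\times10^{-12}$ F m$^{-1}$. 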
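
The arithmetic behind the propagation speed is easily checked. The snippet below evaluates $1/\sqrt{\mu_0\epsilon_0}$ using the rounded values of the two constants quoted in the text; the variable names are chosen for this illustration only.

```python
import math

mu_0 = 4.0 * math.pi * 1e-7     # permeability of a vacuum in H/m, as quoted in the text
epsilon_0 = 8.85e-12            # permittivity of a vacuum in F/m, rounded value from the text

c = 1.0 / math.sqrt(mu_0 * epsilon_0)   # propagation speed of the waves in m/s
print(f"{c:.3e} m/s")
```

With these rounded inputs the result is close to $3\times10^8$ m s$^{-1}$, the measured speed of light.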

This relation between the constants $\epsilon_0$ and $\mu_0$ and the speed of light was one of 
the most startling consequences of Maxwell’s theory. But what do we mean by a 
‘vacuum’? Does it define an absolute frame of rest? If we deny the existence of an 
absolute frame of rest then how do we formulate a theory of electromagnetism? 
How do Maxwell’s equations appear in frames moving with respect to each other? 
Do we need to change the value of c? If we do, what will happen to the values 
of $\epsilon_0$ and $\mu_0$? 

Einstein solves all of these problems at a stroke by saying that Maxwell’s 
equations take the same mathematical form in all inertial frames. The speed of light 
c is thus the same in all inertial frames. The theory of special relativity (including 
amazing conclusions such as $E = mc^2$) follows from a generalisation of this 
simple and theoretically compelling assumption. Maxwell’s equations therefore 
require special relativity. You see that for a master theorist like Einstein, the 
Michelson-Morley experiment might well have been a side issue. Einstein could 
‘see’ special relativity lurking in Maxwell’s equations. 


Exercises 

1.1 For two inertial frames S and S' in standard configuration, show that the coordinates 
of any given event in each frame are related by the Lorentz transformations (1.3). 

1.2 Two events A and B have coordinates $(t_A, x_A, y_A, z_A)$ and $(t_B, x_B, y_B, z_B)$ respectively. 
Show that both the time difference $\Delta t = t_B - t_A$ and the quantity 

\[ \Delta r^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 \]

are separately invariant under any Galilean transformation, whereas the quantity 

\[ \Delta s^2 = c^2\Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2 \]

is invariant under any Lorentz transformation. 

1.3 In a given inertial frame two particles are shot out simultaneously from a given 
point, with equal speeds v in orthogonal directions. What is the speed of each particle 
relative to the other? 

1.4 An inertial frame S' is related to S by a boost of speed v in the x-direction, and S'' 
is related to S' by a boost of speed u' in the x'-direction. Show that S'' is related to 
S by a boost in the x-direction with speed u, where 

\[ u = c\tanh(\psi_v + \psi_{u'}), \]

$\tanh\psi_v = v/c$ and $\tanh\psi_{u'} = u'/c$. 

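1.5 An inertial frame S' is related to S by a boost $\vec{v}$ whose components in S are $(v_x, v_y, v_z)$. 
Show that the coordinates $(ct', x', y', z')$ and $(ct, x, y, z)$ of an event are related by 

\[
\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix}
\gamma & -\gamma\beta_x & -\gamma\beta_y & -\gamma\beta_z \\
-\gamma\beta_x & 1 + \alpha\beta_x^2 & \alpha\beta_x\beta_y & \alpha\beta_x\beta_z \\
-\gamma\beta_y & \alpha\beta_y\beta_x & 1 + \alpha\beta_y^2 & \alpha\beta_y\beta_z \\
-\gamma\beta_z & \alpha\beta_z\beta_x & \alpha\beta_z\beta_y & 1 + \alpha\beta_z^2
\end{pmatrix}
\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix},
\]

where $\vec{\beta} = \vec{v}/c$, $\gamma = (1 - |\vec{\beta}|^2)^{-1/2}$ and $\alpha = (\gamma - 1)/|\vec{\beta}|^2$. Hint: The transformation 
must take the same form if both S and S' undergo the same spatial rotation. 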

1.6 An inertial frame S' is related to S by a boost of speed u in the positive x-direction. 
Similarly, S'' is related to S' by a boost of speed v in the y'-direction. Find the 
transformation relating the coordinates $(ct, x, y, z)$ and $(ct'', x'', y'', z'')$ and hence 
describe how S and S'' are physically related. 

1.7 The frames S and S' are in standard configuration. A straight rod rotates at a uniform 
angular velocity $\omega'$ about its centre, which is fixed at the origin of S'. If the rod lies 
along the x'-axis at t' = 0, obtain an equation for the shape of the rod in S at t = 0. 

1.8 Two events A and B have coordinates $(t_A, x_A, y_A, z_A)$ and $(t_B, x_B, y_B, z_B)$ respectively 
in some inertial frame S and are separated by a spacelike interval. Obtain an 
expression for the boost $\vec{v}$ required to transform to a new inertial frame S' in which 
the events A and B occur simultaneously. 

1.9 Derive the Doppler effect (1.12) directly, using the Lorentz transformation 
formulae (1.3). 

1.10 Two observers are moving along trajectories parallel to the y-axis in some inertial 
frame. Observer A emits a photon with frequency $\nu_A$ that travels in the positive 
x-direction and is received by observer B with frequency $\nu_B$. Show that the Doppler 
shift $\nu_B/\nu_A$ in the photon frequency is the same whether A and B travel in the same 
direction or opposite directions. 

1.11 Astronauts in a spaceship travelling in a straight line past the Earth at speed v = c/2 
wish to tune into Radio 4 on 198 kHz. To what frequency should they tune at the 
instant when the ship is closest to Earth? 

1.12 Draw a spacetime diagram illustrating the coordinate transformation corresponding 
to two inertial frames S and S' in standard configuration (i.e. where S' moves at a 
speed v along the positive x-direction and the two frames coincide at t = t' = 0). 
Show that the angle between the x- and x'- axes is the same as that between the t- 
and t'- axes and has the value $\tan^{-1}(v/c)$. 

1.13 Consider an event P separated by a timelike interval from the origin O of your 
diagram in Exercise 1.12. Show that the tangent to the invariant hyperbola passing 
through P is a line of simultaneity in the inertial frame whose time axis joins P 
to the origin. Hence, from your spacetime diagram, derive the formulae for length 
contraction and time dilation. 

1.14 Alex and Bob are twins working on a space station located at a fixed position in 
deep space. Alex undertakes an extended return spaceflight to a distant star, while 
Bob stays on the station. Show that, on his return to the station, the proper time 
interval experienced by Alex must be less than that experienced by Bob, hence Bob 
is now the elder. How does Alex explain this age difference? 

1.15 A spaceship travels at a variable speed u(t) in some inertial frame S. An observer 
on the spaceship measures its acceleration to be $f(\tau)$, where $\tau$ is the proper time. 
If at $\tau = 0$ the spaceship has a speed $u_0$ in S show that 

\[ \frac{u(\tau) - u_0}{1 - u(\tau)u_0/c^2} = c\tanh\psi(\tau), \]

where $c\psi(\tau) = \int_0^\tau f(\tau')\,d\tau'$. Show that the velocity of the spaceship can never reach c. 

1.16 If the spaceship in Exercise 1.15 left base at time $t = \tau = 0$ and travelled forever 
in a straight line with constant acceleration f, show that no signal sent by base 
later than time t = c/f can ever reach the spaceship. By sketching an appropriate 
spacetime diagram show that light signals sent from the base appear increasingly 
redshifted to an observer on the spaceship. If the acceleration of the spaceship is g 
(for the comfort of its occupants), how long by the spaceship clock does it take to 
reach a star 10 light years from the base? 


Our discussion of special relativity has led us to model the physical world as a 
four-dimensional continuum, called spacetime, with a Minkowski geometry. This 
is an example of a manifold. As we shall see, the more complicated spacetime 
geometries of general relativity are also examples of manifolds. It is therefore 
worthwhile discussing manifolds in general. In the following we consider general 
properties of manifolds commonly encountered in physics, and we concentrate in 
particular on Riemannian manifolds, which will be central to our discussion of 
general relativity. 


2.1 The concept of a manifold 

In general, a manifold is any set that can be continuously parameterised. The 
number of independent parameters required to specify any point in the set uniquely 
is the dimension of the manifold, and the parameters themselves are the coordinates 
of the manifold. An abstract example is the set of all rigid rotations of 
Cartesian coordinate systems in three-dimensional Euclidean space, which can be 
parameterised by the Euler angles. So the set of rotations is a three-dimensional 
manifold: each point is a particular rotation, and the coordinates of the point 
are the three Euler angles. Similarly, the phase space of a particle in classical 
mechanics can be parameterised by three position coordinates $(q_1, q_2, q_3)$ and 
three momentum coordinates $(p_1, p_2, p_3)$, and thus the set of points in this phase 
space forms a six-dimensional manifold. In fact, one can regard ‘manifold’ as just 
a fancy word for ‘space’ in the general mathematical sense. 

In its most primitive form a general manifold is simply an amorphous collection 
of points. Most manifolds used in physics, however, are ‘differential manifolds’, 
which are continuous and differentiable in the following way. A manifold is 
continuous if, in the neighbourhood of every point P, there are other points whose 
coordinates differ infinitesimally from those of P. A manifold is differentiable if 
it is possible to define a scalar field at each point of the manifold that can be 
differentiated everywhere. Both our examples above are differential manifolds. 

The association of points with the values of their parameters can be thought of 
as a mapping of the points of a manifold into points of the Euclidean space of the 
same dimension. This means that ‘locally’ a manifold looks like the corresponding 
Euclidean space: it is ‘smooth’ and has a certain number of dimensions. 


2.2 Coordinates 

An N-dimensional manifold $\mathcal{M}$ of points is one for which N independent real 
coordinates $(x^1, x^2, \ldots, x^N)$ are required to specify any point completely.¹ These 
N coordinates are entirely general and are denoted collectively by $x^a$, where it is 
understood that $a = 1, 2, \ldots, N$. 

As a technical point, we should mention that in general it may not be possible 
to cover the whole manifold with only one non-degenerate coordinate system, 
namely, one which ascribes a unique set of N coordinate values to each point, 
so that the correspondence between points and sets of coordinate values (labels) 
is one-to-one. Let us consider, for example, the points that constitute a plane. 
These points clearly form a two-dimensional manifold (called $\mathbb{R}^2$). An example 
of a degenerate coordinate system on this manifold is the polar coordinates $(r, \phi)$ 
in the plane, which have a degeneracy at the origin because $\phi$ is indeterminate 
there. For this manifold, we could avoid the degeneracy at the origin by using, 
for example, Cartesian coordinates. For a general manifold, however, we might 
have no choice in the matter and might have to work with coordinate systems that 
cover only a portion of the manifold, called coordinate patches. For example, the 
set of points making up the surface of a sphere forms a two-dimensional manifold 
(called $S^2$). This manifold is usually ‘parameterised’ by the coordinates $\theta$ and 
$\phi$, but $\phi$ is degenerate at the poles. In this case, however, it can be shown that 
there is no coordinate system that covers the whole of $S^2$ without degeneracy; the 
smallest number of patches needed is two. In general, a set of coordinate patches 
that covers the whole manifold is called an atlas. 
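
The degeneracy of polar coordinates at the origin of the plane can be seen directly in a few lines of code. This is a minimal sketch with a hypothetical function name: two different coordinate labels with r = 0 map to the same point, whereas away from the origin different labels give different points.

```python
import math

def polar_to_cartesian(r, phi):
    """Map the polar labels (r, phi) to the Cartesian labels (x, y) of the same point."""
    return (r * math.cos(phi), r * math.sin(phi))

# Two different polar labels with r = 0 ...
origin_a = polar_to_cartesian(0.0, 0.3)
origin_b = polar_to_cartesian(0.0, 2.0)
same_point = (origin_a == origin_b)     # ... describe one and the same point

# Away from the origin, different values of phi label different points.
p = polar_to_cartesian(1.0, 0.3)
q = polar_to_cartesian(1.0, 2.0)
print(same_point, p != q)
```

The correspondence between points and labels therefore fails to be one-to-one only at the origin, which is why Cartesian coordinates are preferable if that point matters.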

Thus, in general, we do not require the whole of a manifold $\mathcal{M}$ to be covered 
by a single coordinate system. Instead, we may have a collection of coordinate 
systems, each covering some part of $\mathcal{M}$, and all these are on an equal footing. 
We do not regard any one coordinate system as in some way preferred. 

¹ The reason why the coordinates are written with superscripts rather than subscripts will become clear later. 


2.3 Curves and surfaces 

Given a manifold, we shall be concerned with points in it and with subsets of 
points that define curves and surfaces. We shall frequently define these curves 
and surfaces parametrically. Thus, since a curve has one degree of freedom, 
it depends on one parameter and so we define a curve in the manifold by the 
parametric equations 

\[ x^a = x^a(u) \qquad (a = 1, 2, \ldots, N), \]

where u is some parameter and $x^1(u), x^2(u), \ldots, x^N(u)$ denote N functions of u. 

Similarly, since a submanifold or surface of M dimensions (M < N) has M 
degrees of freedom, it depends on M parameters and is given by the N parametric 
equations 

\[ x^a = x^a(u^1, u^2, \ldots, u^M) \qquad (a = 1, 2, \ldots, N). \tag{2.1} \]

If, in particular, M = N − 1 then the submanifold is called a hypersurface. In this 
case, the N − 1 parameters can be eliminated from these N equations to give one 
equation relating the coordinates, i.e. 

\[ f(x^1, x^2, \ldots, x^N) = 0. \]

From a different but equivalent point of view, a point in a manifold is characterised 
by N coordinates. If the point is restricted to lie in a particular hypersurface, 
i.e. an (N − 1)-dimensional subspace, then the point’s coordinates must satisfy 
one constraint equation, namely 

\[ f(x^1, x^2, \ldots, x^N) = 0. \]

Similarly, points in an M-dimensional subspace (M < N) must satisfy N − M 
constraints 

\[
\begin{aligned}
f_1(x^1, x^2, \ldots, x^N) &= 0, \\
f_2(x^1, x^2, \ldots, x^N) &= 0, \\
&\;\;\vdots \\
f_{N-M}(x^1, x^2, \ldots, x^N) &= 0,
\end{aligned}
\]

which is an alternative to the parametric representation (2.1). 
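
As an illustration of the two equivalent descriptions, consider the surface of a sphere of radius R regarded as a hypersurface (M = 2) in three-dimensional Euclidean space (N = 3). The sketch below, with hypothetical function names, gives the three parametric equations in terms of two parameters and the single constraint obtained by eliminating them.

```python
import math

def sphere_point(theta, phi, radius=1.0):
    """Parametric form: N = 3 coordinates as functions of M = 2 parameters."""
    x = radius * math.sin(theta) * math.cos(phi)
    y = radius * math.sin(theta) * math.sin(phi)
    z = radius * math.cos(theta)
    return x, y, z

def constraint(x, y, z, radius=1.0):
    """Constraint form: f(x, y, z) = 0 on the sphere, with the parameters eliminated."""
    return x * x + y * y + z * z - radius**2

# Every point generated parametrically satisfies the single constraint equation.
residuals = [constraint(*sphere_point(theta, phi))
             for theta in (0.2, 0.7, 1.5, 2.9)
             for phi in (0.0, 2.1, 4.0)]
print(max(abs(r) for r in residuals))
```

The residuals vanish to rounding error, confirming that the N − M = 1 constraint carries the same information as the parametric equations.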


2.4 Coordinate transformations 

To locate a point in a manifold we use a system of N coordinates, but the choice of 
these coordinates is arbitrary. The important idea is not the ‘labels’ but the points 
themselves and the geometrical and topological relationships between them. 

We may relabel the points of a manifold by performing a coordinate transformation 
$x^a \to x'^a$ expressed by the N equations 

\[ x'^a = x'^a(x^1, x^2, \ldots, x^N) \qquad (a = 1, 2, \ldots, N), \tag{2.2} \]

giving each new coordinate as a function of the old coordinates. Hence we 
view a coordinate transformation passively as assigning the new primed coordinates 
$(x'^1, x'^2, \ldots, x'^N)$ to a point of the manifold whose old coordinates are 
$(x^1, x^2, \ldots, x^N)$. 

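We will assume that the functions involved in (2.2) are single-valued, continuous 
and differentiable over the valid ranges of their arguments. Thus by differentiating 
each equation in (2.2) with respect to each of the old coordinates $x^b$ we 
obtain the N × N partial derivatives $\partial x'^a/\partial x^b$. These may be assembled into the 
N × N transformation matrix² 

\[
\left[\frac{\partial x'^a}{\partial x^b}\right] =
\begin{pmatrix}
\dfrac{\partial x'^1}{\partial x^1} & \dfrac{\partial x'^1}{\partial x^2} & \cdots & \dfrac{\partial x'^1}{\partial x^N} \\
\dfrac{\partial x'^2}{\partial x^1} & \dfrac{\partial x'^2}{\partial x^2} & \cdots & \dfrac{\partial x'^2}{\partial x^N} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial x'^N}{\partial x^1} & \dfrac{\partial x'^N}{\partial x^2} & \cdots & \dfrac{\partial x'^N}{\partial x^N}
\end{pmatrix},
\]
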


so that rows are labelled by the index in the numerator of the partial derivative 
and columns by the index in the denominator. The elements of the transformation 
matrix are functions of the coordinates, and so the numerical values of the 
matrix elements are in general different when evaluated at different points in the 
manifold. The determinant of the transformation matrix is called the Jacobian of 
the transformation and is denoted by 

\[ J = \det\left[\frac{\partial x'^a}{\partial x^b}\right]. \]

Clearly, the numerical value of J also varies from point to point in the manifold. 

If $J \neq 0$ for some range of the coordinates $x^b$ then it follows that in this region 
we can (in principle) solve the equations (2.2) for the old coordinates $x^b$ and 
obtain the inverse transformation equations 

\[ x^a = x^a(x'^1, x'^2, \ldots, x'^N) \qquad (a = 1, 2, \ldots, N). \]

² In general the notation [ ] denotes the matrix containing the elements within the square brackets. 

In a similar manner to the above, we define the inverse transformation matrix 
$[\partial x^a/\partial x'^b]$ and the Jacobian of the inverse transformation $J' = \det[\partial x^a/\partial x'^b]$. 

Using the chain rule, it is easy to show that the inverse transformation matrix
is the inverse of the transformation matrix, since

$$
\sum_{b=1}^{N} \frac{\partial x'^a}{\partial x^b}\frac{\partial x^b}{\partial x'^c}
= \frac{\partial x'^a}{\partial x'^c} = \delta^a_c =
\begin{cases} 1 & \text{if } a = c, \\ 0 & \text{if } a \neq c, \end{cases}
$$

where we have defined the Kronecker delta $\delta^a_c$ and used the fact that

$$ \frac{\partial x'^a}{\partial x'^c} = \frac{\partial x^a}{\partial x^c} = 0 \qquad \text{if } a \neq c, $$

because the coordinates in either the unprimed or the primed set are independent.
Since the two transformation matrices are inverses of one another, it follows that
$J' = 1/J$.
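As a concrete check of these statements, the following minimal sketch takes the transformation from Cartesian coordinates $(x, y)$ to plane polar coordinates $(r, \phi)$ as an assumed example (it is not one used in the text above). The function names are illustrative only. It builds both transformation matrices by hand, multiplies them, and compares the two Jacobians.

```python
import math

def forward_matrix(x, y):
    """Transformation matrix [dx'^a/dx^b] for (x, y) -> (r, phi).

    Rows are labelled by the new coordinate, columns by the old one.
    """
    r = math.hypot(x, y)
    return [[x / r, y / r],
            [-y / r**2, x / r**2]]

def inverse_matrix(r, phi):
    """Inverse transformation matrix [dx^a/dx'^b] for (r, phi) -> (x, y)."""
    return [[math.cos(phi), -r * math.sin(phi)],
            [math.sin(phi), r * math.cos(phi)]]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(p, q):
    """Product of two 2 x 2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Evaluate everything at one sample point of the manifold.
x, y = 1.2, 0.7
r, phi = math.hypot(x, y), math.atan2(y, x)

F = forward_matrix(x, y)
G = inverse_matrix(r, phi)
product = matmul2(F, G)           # should reproduce the Kronecker delta
J, J_prime = det2(F), det2(G)     # here J = 1/r and J' = r
```

Evaluating at a different point changes the matrix elements and the value of $J$, but the product of the two matrices remains the unit matrix wherever $J \neq 0$.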

If we consider neighbouring points $P$ and $Q$ in the manifold, with coordinates
$x^a$ and $x^a + dx^a$ respectively, then in the new, primed, coordinate system the
infinitesimal coordinate separation between $P$ and $Q$ is given by

$$
dx'^a = \frac{\partial x'^a}{\partial x^1}\,dx^1 + \frac{\partial x'^a}{\partial x^2}\,dx^2
+ \cdots + \frac{\partial x'^a}{\partial x^N}\,dx^N,
$$

where it is understood that the partial derivatives on the right-hand side are
evaluated at the point $P$. We can write this more economically as

$$ dx'^a = \sum_{b=1}^{N} \frac{\partial x'^a}{\partial x^b}\,dx^b. \tag{2.3} $$


2.5 Summation convention 

Our notation can be made more economical still by adopting Einstein’s summation
convention: whenever an index occurs twice in an expression, once as a subscript
and once as a superscript, this is understood to imply a summation over the index
from 1 to $N$, the dimension of the manifold.

Thus we can write (2.3) simply as

$$ dx'^a = \frac{\partial x'^a}{\partial x^b}\,dx^b, $$

where, once again, it is understood that all the partial derivatives are evaluated at
$P$. The index $a$ appearing on each side of this equation is said to be a free index
and may take on separately any value from 1 to $N$. We consider a superscript that
appears in the denominator of a partial derivative as a subscript (and vice versa).
Thus the index $b$ on the right-hand side in effect appears once as a subscript
and once as a superscript, and hence there is an implied summation from 1 to
$N$. An index that is summed over in this way is called a dummy index, because
it can be replaced by any other index not already in use. For example, we may
write

$$ \frac{\partial x'^a}{\partial x^b}\,dx^b = \frac{\partial x'^a}{\partial x^c}\,dx^c, $$

since $c$ was not already in use in the expression.

Note that the proper use of the summation convention requires that, in any
term, an index should not occur more than twice and that any repeated index must
occur once as a subscript and once as a superscript.
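The roles of the free and dummy indices can be made explicit in code. The sketch below again assumes the Cartesian-to-polar transformation as an illustrative example (the helper names are hypothetical): the free index $a$ becomes the position in the output list, while the dummy index $b$ becomes the variable of an explicit sum. The result is compared with the actual change in the polar coordinates for a small displacement.

```python
import math

def to_polar(x, y):
    """New coordinates x'^a = (r, phi) as functions of the old ones."""
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian(x, y):
    """Transformation matrix [dx'^a/dx^b] evaluated at the point (x, y)."""
    r = math.hypot(x, y)
    return [[x / r, y / r],
            [-y / r**2, x / r**2]]

def transform_displacement(jac, dx):
    """dx'^a = (dx'^a/dx^b) dx^b.

    a is the free index (one output entry per value of a);
    b is the dummy index (summed over, so its name is irrelevant).
    """
    n = len(dx)
    return [sum(jac[a][b] * dx[b] for b in range(n)) for a in range(n)]

P = (1.0, 2.0)                  # point P with coordinates x^a
dx = (1e-6, -2e-6)              # small coordinate separation to Q
Q = (P[0] + dx[0], P[1] + dx[1])

dx_prime = transform_displacement(jacobian(*P), dx)
exact = [q - p for q, p in zip(to_polar(*Q), to_polar(*P))]
```

The two lists agree up to terms of second order in the displacement, as expected for a relation that holds for infinitesimal separations.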


2.6 Geometry of manifolds 

So far, we have considered manifolds only in a very primitive form. We have 
assumed that the manifold is continuous and differentiable, but aside from these 
properties it remains an amorphous collection of points. We have not yet defined 
its geometry. 

Consider two infinitesimally separated points $P$ and $Q$ in the manifold, with
coordinates $x^a$ and $x^a + dx^a$ respectively ($a = 1, 2, \ldots, N$). The local geometry
of the manifold at the point $P$ is determined by defining the invariant ‘distance’
or ‘interval’ $ds$ between $P$ and $Q$. In general, the distance between the points can
be assigned to be any reasonably well-behaved function of the coordinates and
their differentials, i.e.³

$$ ds^2 = f(x^a, dx^a). $$

Clearly this function contains information on both the local geometry of the
manifold at $P$ and our chosen coordinate system. It is the assignment at each
point in the manifold of a distance between points with infinitesimally different
values of the coordinates that determines the local geometry of the manifold. To
choose an example at random, a two-dimensional manifold, beloved of differential
geometers for its richness, is the Finsler geometry, in which one may define
coordinates $\xi$ and $\zeta$ such that

$$ ds^2 = (d\xi^4 + d\zeta^4)^{1/2}. $$

³ It is conventional to give the expression for $ds^2$ rather than $ds$.


2.7 Riemannian geometry

For developing general relativity, we are not interested in the most general
geometries and can confine our attention to manifolds in which the interval is
given by an expression of the form⁴ (assuming the summation convention)

$$ ds^2 = g_{ab}(x)\,dx^a dx^b. \tag{2.4} $$

Thus, such an interval is quadratic in the coordinate differentials. We shall see
below that the $g_{ab}(x)$ are the components of the metric tensor field in our chosen
coordinate system. For the moment, however, we can consider them simply as
a set of functions of the coordinates that determine the local geometry of the
manifold at any point. Manifolds with a geometry expressible in the form (2.4) are
called Riemannian manifolds. Strictly speaking, the manifold is only Riemannian
if $ds^2 > 0$ always. If $ds^2$ can be positive or negative (or zero), as is the case
in special relativity and general relativity, then the manifold should properly be
called pseudo-Riemannian but is usually simply referred to as Riemannian.

The metric functions $g_{ab}(x)$ can be considered as the elements of a
position-dependent $N \times N$ matrix. The metric functions can always be chosen so that
$g_{ab}(x) = g_{ba}(x)$, i.e. the matrix is symmetric. Suppose for argument’s sake that the
functions $g_{ab}$ were not symmetric in $a$ and $b$. Then we could always decompose
the metric function into parts that are symmetric and antisymmetric respectively
in $a$ and $b$, i.e.

$$ g_{ab}(x) = \tfrac{1}{2}[g_{ab}(x) + g_{ba}(x)] + \tfrac{1}{2}[g_{ab}(x) - g_{ba}(x)]. $$

The contribution to $ds^2$ from the antisymmetric part would be
$\tfrac{1}{2}[g_{ab}(x) - g_{ba}(x)]\,dx^a dx^b$, which vanishes identically, as is easily confirmed on swapping
indices in one of the terms, so that any antisymmetric part of $g_{ab}$ can safely be
neglected. Thus in an $N$-dimensional Riemannian manifold there are $\tfrac{1}{2}N(N+1)$
independent metric functions $g_{ab}(x)$.
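The vanishing of the antisymmetric contribution is easy to confirm numerically. The following sketch uses a deliberately non-symmetric $2 \times 2$ array of made-up numbers as a stand-in for $g_{ab}$ at one point; the function names are illustrative.

```python
def interval_squared(g, dx):
    """ds^2 = g_ab dx^a dx^b, with both sums written out explicitly."""
    n = len(dx)
    return sum(g[a][b] * dx[a] * dx[b] for a in range(n) for b in range(n))

def symmetric_part(g):
    """(g_ab + g_ba) / 2."""
    n = len(g)
    return [[0.5 * (g[a][b] + g[b][a]) for b in range(n)] for a in range(n)]

def antisymmetric_part(g):
    """(g_ab - g_ba) / 2."""
    n = len(g)
    return [[0.5 * (g[a][b] - g[b][a]) for b in range(n)] for a in range(n)]

def independent_metric_functions(n):
    """Number of independent entries of a symmetric n x n matrix."""
    return n * (n + 1) // 2

# Arbitrary illustrative values; g is deliberately not symmetric.
g = [[2.0, 0.3],
     [-0.7, 1.5]]
dx = [0.01, -0.02]

full = interval_squared(g, dx)
from_symmetric = interval_squared(symmetric_part(g), dx)
from_antisymmetric = interval_squared(antisymmetric_part(g), dx)
```

Only the symmetric part contributes to `full`, which is why nothing is lost by taking $g_{ab}$ symmetric from the outset.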

It is important to remember that the form of the metric functions can always
be changed by making a change of coordinates. Since the interval between two
points in the manifold is invariant under a coordinate transformation, using (2.4)
and (2.3) we have

$$
ds^2 = g_{ab}(x)\,dx^a dx^b
= g_{ab}(x)\,\frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}\,dx'^c dx'^d
= g'_{cd}(x')\,dx'^c dx'^d, \tag{2.5}
$$

where the new metric functions $g'_{ab}(x')$ in the primed coordinate system are related
to those in the unprimed coordinate system by

$$ g'_{cd}(x') = g_{ab}(x(x'))\,\frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}. $$

Clearly, the metric functions $g'_{ab}(x')$ describe the same local geometry of the
manifold as do the functions $g_{ab}(x)$.

Since there are $N$ arbitrary coordinate transformations there are really only
$\tfrac{1}{2}N(N+1) - N = \tfrac{1}{2}N(N-1)$ independent degrees of freedom associated with the
$g_{ab}(x)$.

⁴ As we shall see in Chapter 7, this is a consequence of the equivalence principle.
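To see the transformation rule for the metric functions in action, the sketch below applies it to an assumed example: the Euclidean plane, with Cartesian coordinates as the unprimed system and plane polar coordinates $(r, \phi)$ as the primed one. The function name `transform_metric` is illustrative.

```python
import math

def transform_metric(g, X):
    """g'_cd = g_ab (dx^a/dx'^c) (dx^b/dx'^d).

    X[a][c] holds dx^a/dx'^c, the inverse transformation matrix.
    """
    n = len(g)
    return [[sum(g[a][b] * X[a][c] * X[b][d]
                 for a in range(n) for b in range(n))
             for d in range(n)]
            for c in range(n)]

# Sample point, given in the primed (polar) coordinates.
r, phi = 2.0, 0.6

# dx^a/dx'^c for x = r cos(phi), y = r sin(phi).
X = [[math.cos(phi), -r * math.sin(phi)],
     [math.sin(phi), r * math.cos(phi)]]

g_cartesian = [[1.0, 0.0],
               [0.0, 1.0]]        # ds^2 = dx^2 + dy^2

g_polar = transform_metric(g_cartesian, X)   # expect diag(1, r^2)
```

The new metric functions correspond to $ds^2 = dr^2 + r^2\,d\phi^2$: they look different from the Cartesian ones but describe the same flat geometry.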


2.8 Intrinsic and extrinsic geometry 

It is important to realise that the local geometry or curvature characterised by 
(2.4) is an intrinsic property of the manifold itself, i.e. it is independent of whether 
the manifold is embedded in some higher-dimensional space. 

It is, of course, difficult (or impossible) to imagine higher-dimensional curved 
manifolds, so it is instructive to consider two-dimensional Riemannian manifolds, 
which can often be visualised as a surface embedded in a three-dimensional 
Euclidean space. It is important to make a distinction, however, between the 
extrinsic properties of the surface, which are dependent on how it is embedded 
into a higher-dimensional space, and properties that are intrinsic to the surface
itself. 

This distinction is traditionally made clear by considering the viewpoint of 
some two-dimensional being (called a ‘bug’) confined exclusively to the two- 
dimensional surface. Such a being would believe that it is able to look and measure 
in all directions, whereas it is in fact limited to making measurements of distance, 
angle etc. only within the surface. For example, it would receive light signals that 
had travelled within the two-dimensional surface. Properties of the geometry that 
are accessible to the bug are called intrinsic, whereas those that depend on the
viewpoint of a higher-dimensional creature (who is able to see how the surface
is shaped in the three-dimensional space) are called extrinsic.

The bug is able to define a coordinate system and measure distances in the 
surface (e.g. by counting how many steps it has to take) from one point to another. 
It can thus define a set of metric functions $g_{ab}(x)$ that characterise the intrinsic
geometry of the surface (as expressed in the bug’s chosen coordinate system). 

Consider, for example, a two-dimensional plane surface, such as a flat sheet
of paper, in our three-dimensional Euclidean space. The bug can label the entire
sheet using rectangular Cartesian coordinates, so that the distance $ds$ measured
over the surface between any pair of points whose coordinate separations are $dx$
and $dy$ is given by

$$ ds^2 = dx^2 + dy^2. $$

If this sheet is then rolled up into a cylinder, the bug would not be able to detect 
any differences in the geometrical properties of the surface (see Figure 2.1). 

To the bug, the angles of a triangle still add up to 180°, the circumference of a
circle is still $2\pi r$ etc. The proof of this fact is simple - the surface can simply be
unrolled back to a flat surface without buckling, tearing or otherwise distorting
it. A more mathematical approach is to note that if one parameterises the surface
of the cylinder (of radius $a$) using cylindrical coordinates $(z, \phi)$, the distance $ds$
measured over the surface between any two points whose coordinate separations
are $dz$ and $d\phi$ is given by

$$ ds^2 = dz^2 + a^2\,d\phi^2. $$

By making the simple change of variables $x = z$ and $y = a\phi$ we recover the
expression $ds^2 = dx^2 + dy^2$, which is valid over the whole surface, and so the
intrinsic geometry is identical to that of a flat plane. Thus the surface of a cylinder 
is not intrinsically curved; its curvature is extrinsic and a result of the way it is 
embedded in three-dimensional space. Even if one were to crumple up the sheet 
of paper (without tearing it), so that its extrinsic geometry in three-dimensional 
space was very complicated, its intrinsic geometry would still be that of a plane. 

Figure 2.1 Rolling up a flat sheet of paper into a cylinder.

The situation is somewhat different for a 2-sphere, i.e. a spherical surface,
embedded in three-dimensional Euclidean space. Once again the surface is
manifestly curved extrinsically on account of its embedding. Additionally, however,
it cannot be formed from a flat sheet of paper without tearing or deformation.
Its intrinsic geometry - based on measurements within the surface - differs from
the intrinsic (Euclidean) geometry of the plane. This problem is well known to
cartographers. Mathematically, if we parameterise a sphere (of radius $a$) by the
usual angular coordinates $(\theta, \phi)$ then

$$ ds^2 = a^2(d\theta^2 + \sin^2\theta\,d\phi^2), $$

which cannot be transformed to the Euclidean form $ds^2 = dx^2 + dy^2$ over the
whole surface by any coordinate transformation. Thus the surface of a sphere is
intrinsically curved.

We note, however, that locally at any point $A$ on the spherical surface we
can define a set of Cartesian coordinates, so that $ds^2 = dx^2 + dy^2$ is valid in the
neighbourhood of $A$. For example, the street layout of a town can be accurately
represented by a flat map, whereas the entire globe can only be represented by
performing projections that distort distance and/or angles. As an idea of what can
happen to local Cartesian coordinate systems far from the point $A$ where they are
defined, consider Figure 2.2. If a bug starts at $A$ and travels in the locally defined
$x$-direction to $B$, it observes that $C$ still lies in the $y$-direction. If instead the bug
travels from $A$ to $C$, it finds that $B$ still lies in the $x$-direction. The non-Euclidean
geometry of the spherical surface is also apparent from the fact that the angles of
the triangle $ABC$ sum to 270°.
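The 270° angle sum can be reproduced with a short calculation. The sketch below assumes, since the figure itself is not reproduced here, that $A$ is the pole of a unit sphere and that $B$ and $C$ lie on the equator a quarter-turn apart, with the sides of the triangle being great-circle arcs. The embedding in three dimensions is used only as a computational convenience; the function names are illustrative.

```python
import math

def dot(u, v):
    """Euclidean scalar product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def tangent(p, q):
    """Unit tangent at p to the great circle running from p towards q.

    p and q are unit vectors; the component of q along p is removed.
    """
    c = dot(p, q)
    t = [qi - c * pi for pi, qi in zip(p, q)]
    norm = math.sqrt(dot(t, t))
    return [ti / norm for ti in t]

def vertex_angle(p, q, r):
    """Angle in degrees at vertex p between the arcs p->q and p->r."""
    return math.degrees(math.acos(dot(tangent(p, q), tangent(p, r))))

# Assumed positions: A at the pole, B and C on the equator.
A = (0.0, 0.0, 1.0)
B = (1.0, 0.0, 0.0)
C = (0.0, 1.0, 0.0)

angle_sum = (vertex_angle(A, B, C)
             + vertex_angle(B, C, A)
             + vertex_angle(C, A, B))
```

Each of the three angles is a right angle, so the sum exceeds the Euclidean value of 180° by 90°.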

We may take our discussion one step further, dispense with the three- 
dimensional space and embedding-related extrinsic geometry and consider the 
surfaces in isolation. Intrinsic geometry is all that remains with any meaning. 
For example, when we talk of the curvature of spacetime in general relativity, 
we must resist any temptation to think of spacetime as embedded in any ‘higher’ 
space. Any such embedding, whether or not it is physically realised, would 
be irrelevant to our discussion. Nevertheless, in developing our intuition for
curved manifolds it often remains useful to imagine two-dimensional surfaces
embedded in three-dimensional Euclidean space.

Figure 2.2 A two-dimensional spherical surface.


2.9 Examples of non-Euclidean geometry 

Let us develop our intuition for non-Euclidean geometry by considering in more
detail the surface of a sphere. We begin by imagining the usual Cartesian
coordinate system $(x, y, z)$ defining a Euclidean three-dimensional space with line
element

$$ ds^2 = dx^2 + dy^2 + dz^2. \tag{2.6} $$

Now, suppose that we have a sphere of radius $a$ with its centre at the origin of
our coordinate system. We will now ask the following question: what is the line
element on the surface of the sphere?

The equation defining the sphere is

$$ x^2 + y^2 + z^2 = a^2. $$

So, differentiating this equation, we obtain

$$ 2x\,dx + 2y\,dy + 2z\,dz = 0, $$

and we can write an equation for $dz$,

$$ dz = -\frac{x\,dx + y\,dy}{z} = -\frac{x\,dx + y\,dy}{[a^2 - (x^2 + y^2)]^{1/2}}. \tag{2.7} $$



Thus, equation (2.7) provides a constraint on $dz$ that keeps us on the surface of
the sphere if we are displaced by small amounts $dx$ and $dy$ from an arbitrary
point on the sphere (for example, the point $A$ in Figure 2.2). Substituting for $dz$
in (2.6) gives us the interval for such constrained displacements:

$$ ds^2 = dx^2 + dy^2 + \frac{(x\,dx + y\,dy)^2}{a^2 - (x^2 + y^2)}, \tag{2.8} $$

which is the line element for the surface of the sphere in terms of our chosen
coordinates (as shown in Figure 2.2), taking $A$ as the origin of $x$ and $y$. We
see that this line element reduces to the Euclidean form $ds^2 = dx^2 + dy^2$ in
the neighbourhood of $A$. Practically, one could construct the coordinate curves
$x$ = constant and $y$ = constant on the surface of the sphere by creating a standard
$(x, y)$ coordinate grid in the tangent plane at $A$ and ‘projecting’ vertically down
onto the spherical surface.
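The line element (2.8) can be checked against the embedding directly. The sketch below, with illustrative function names and arbitrarily chosen sample values, compares the three-dimensional Euclidean distance between two nearby points on the upper hemisphere with the value given by (2.8).

```python
import math

a = 1.0   # radius of the sphere (arbitrary choice for the check)

def z_on_sphere(x, y):
    """Height of the upper hemisphere above the point (x, y)."""
    return math.sqrt(a * a - x * x - y * y)

def ds2_embedding(x, y, dx, dy):
    """dx^2 + dy^2 + dz^2 using the actual change in z on the sphere."""
    dz = z_on_sphere(x + dx, y + dy) - z_on_sphere(x, y)
    return dx * dx + dy * dy + dz * dz

def ds2_intrinsic(x, y, dx, dy):
    """Line element (2.8), which refers only to x, y and their differentials."""
    return dx * dx + dy * dy + (x * dx + y * dy) ** 2 / (a * a - (x * x + y * y))

x, y = 0.3, 0.4
dx, dy = 1e-5, -2e-5

embedded = ds2_embedding(x, y, dx, dy)
intrinsic = ds2_intrinsic(x, y, dx, dy)
relative_difference = abs(embedded - intrinsic) / intrinsic

# At the origin A the extra term vanishes and the Euclidean form is recovered.
at_origin = ds2_intrinsic(0.0, 0.0, dx, dy)
```

The two values differ only by terms of higher order in the displacement, and the difference shrinks as `dx` and `dy` are made smaller.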
We may obtain an alternative form for the line element by making the
substitutions

$$ x = \rho\cos\phi, \qquad y = \rho\sin\phi, $$

and after a little algebra we obtain⁵

$$ ds^2 = \frac{a^2\,d\rho^2}{a^2 - \rho^2} + \rho^2\,d\phi^2. \tag{2.9} $$

As above, one can construct the $\rho$ and $\phi$ coordinate curves on the sphere by
creating a standard $(\rho, \phi)$ coordinate system in the tangent plane at $A$ and projecting
vertically down onto the surface. We also note that this line element contains
a ‘hidden symmetry’, namely our freedom to choose an arbitrary point on the
sphere as the origin $\rho = 0$.
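The ‘little algebra’ leading from (2.8) to (2.9) is not spelled out in the text; one way to fill in the intermediate steps, offered here as a sketch, is the following.

```latex
\begin{align*}
dx &= \cos\phi\,d\rho - \rho\sin\phi\,d\phi, &
dy &= \sin\phi\,d\rho + \rho\cos\phi\,d\phi, \\
dx^2 + dy^2 &= d\rho^2 + \rho^2\,d\phi^2, &
x\,dx + y\,dy &= \rho\,d\rho, \qquad x^2 + y^2 = \rho^2, \\
ds^2 &= d\rho^2 + \rho^2\,d\phi^2 + \frac{\rho^2\,d\rho^2}{a^2 - \rho^2}
      = \frac{a^2\,d\rho^2}{a^2 - \rho^2} + \rho^2\,d\phi^2 .
\end{align*}
```

The last step simply combines the two $d\rho^2$ terms over the common denominator $a^2 - \rho^2$.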

The observant reader will have noticed that the line elements (2.8) and (2.9)
have singularities at $\sqrt{x^2 + y^2} = a$, or, equivalently, $\rho = a$, corresponding to the
equator of the sphere (relative to $A$). From our embedding picture, it is clear
why the $(x, y)$ and $(\rho, \phi)$ coordinates cover the surface of the sphere uniquely
only up to this point. We note, however, that there is nothing pathological in the
intrinsic geometry of the 2-sphere at the equator. What we have observed is only
a coordinate singularity, which has resulted simply from choosing coordinates
with a restricted domain of validity. Although the embedding picture we have
adopted gives both the $(x, y)$ and $(\rho, \phi)$ coordinate systems a clear geometrical
meaning in our three-dimensional Euclidean space, it is important to realise that
a bug confined to the two-dimensional surface of the sphere could, if it wished,
have defined these coordinate systems to describe the intrinsic geometry without
any reference to an embedding in higher dimensions.

We can make an analogous construction to find the metric for a 3-sphere
embedded in four-dimensional Euclidean space. The metric for the four-dimensional
Euclidean space is

$$ ds^2 = dx^2 + dy^2 + dz^2 + dw^2, \tag{2.10} $$

and, by analogy with the example above, the equation defining a 3-sphere is

$$ x^2 + y^2 + z^2 + w^2 = a^2. $$


Differentiating as before gives

$$ 2x\,dx + 2y\,dy + 2z\,dz + 2w\,dw = 0, $$

and so substituting for $dw$ in (2.10) gives the line element:

$$ ds^2 = dx^2 + dy^2 + dz^2 + \frac{(x\,dx + y\,dy + z\,dz)^2}{a^2 - (x^2 + y^2 + z^2)}. $$

⁵ Note that the line elements (2.8) and (2.9) look different from the metric we would write down using
standard spherical polars, $ds^2 = a^2\,d\theta^2 + a^2\sin^2\theta\,d\phi^2$. Nonetheless, both are valid line elements for the
two-dimensional surface of a sphere.


Transforming to spherical polar coordinates

$$ x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta, $$

we obtain an alternative form for the line element:

$$ ds^2 = \frac{a^2}{a^2 - r^2}\,dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2. \tag{2.11} $$

Notice that, in the limit $a \to \infty$, the metric tends to the form

$$ ds^2 = dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2, $$

which is simply the metric of ordinary Euclidean three-dimensional space
$ds^2 = dx^2 + dy^2 + dz^2$, rewritten in spherical polar coordinates. The line element (2.11)
therefore describes a non-Euclidean three-dimensional space. We note that this
line element also has a singularity, this time at $r = a$. As one might expect from
our discussion above, this is once again just a coordinate singularity, although our
existence as three-dimensional ‘bugs’ makes the geometric reason for this less
straightforward to visualise!


2.10 Lengths, areas and volumes 

For a given set of metric functions $g_{ab}(x)$ in (2.4), it is useful to know how to
compute the lengths of curves and the ‘areas’ and ‘volumes’ of subregions of the
manifold.

The lengths of curves follow immediately from the line element. Suppose that
the points $A$ and $B$ are joined by some path; then the length of this curve is given by

$$ L_{AB} = \int_A^B ds = \int_A^B \left| g_{ab}\,dx^a dx^b \right|^{1/2}, $$

where the integral is evaluated along the curve. As indicated, the absolute
value of $ds^2$ is taken before the square root is evaluated when considering
pseudo-Riemannian manifolds. If the equation of the curve $x^a(u)$ is given in
terms of some parameter $u$ then

$$ L_{AB} = \int_{u_A}^{u_B} \left| g_{ab}\,\frac{dx^a}{du}\frac{dx^b}{du} \right|^{1/2} du, \tag{2.12} $$

where $u_A$ and $u_B$ are the values of the parameter $u$ at the endpoints of the curve.
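Equation (2.12) translates directly into a numerical recipe. The sketch below is one possible implementation under stated assumptions: the midpoint rule for the integral, a central difference for $dx^a/du$, and illustrative function names. It is applied to a circle of constant latitude $\theta = \theta_0$ on a sphere of radius $a$, using the metric $ds^2 = a^2(d\theta^2 + \sin^2\theta\,d\phi^2)$, for which the length should be $2\pi a\sin\theta_0$.

```python
import math

def sphere_metric(point, a=1.0):
    """Metric functions g_ab at the point (theta, phi) on a sphere of radius a."""
    theta, _phi = point
    return [[a * a, 0.0],
            [0.0, (a * math.sin(theta)) ** 2]]

def curve_length(metric, curve, u_start, u_end, steps=2000):
    """Approximate (2.12): integral of |g_ab (dx^a/du)(dx^b/du)|^(1/2) du."""
    h = (u_end - u_start) / steps
    eps = 1e-6                       # step for the central difference
    total = 0.0
    for k in range(steps):
        u = u_start + (k + 0.5) * h  # midpoint of the k-th sub-interval
        x = curve(u)
        ahead, behind = curve(u + eps), curve(u - eps)
        dxdu = [(p - m) / (2 * eps) for p, m in zip(ahead, behind)]
        g = metric(x)
        n = len(x)
        integrand = sum(g[i][j] * dxdu[i] * dxdu[j]
                        for i in range(n) for j in range(n))
        total += math.sqrt(abs(integrand)) * h
    return total

theta0 = math.pi / 3

def latitude_circle(u):
    """Curve x^a(u) = (theta0, u), with the parameter u playing the role of phi."""
    return (theta0, u)

length = curve_length(sphere_metric, latitude_circle, 0.0, 2 * math.pi)
expected = 2 * math.pi * math.sin(theta0)
```

The absolute value inside the square root is redundant for this strictly Riemannian example, but it is kept so that the same routine can be used when the metric is not positive definite.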

For the calculation of areas and volumes, let us begin by considering the simple
case where the metric is diagonal, i.e. $g_{ab}(x) = 0$ for $a \neq b$.⁶ In this case the line
element takes the form

$$ ds^2 = g_{11}(dx^1)^2 + g_{22}(dx^2)^2 + \cdots + g_{NN}(dx^N)^2. \tag{2.13} $$

Such a system of coordinates is called orthogonal since, at all points in the
manifold, any pair of coordinate curves cross at right angles, as is clear from
(2.13). Thus, in orthogonal coordinate systems the ideas of area and volume can be
built up simply. Consider, for example, an element of area in the $(x^1, x^2)$-surface
defined by $x^a$ = constant for $a = 3, 4, \ldots, N$. Suppose that the area element is
defined by the coordinate lengths $dx^1$ and $dx^2$ (see Figure 2.3). The proper
lengths of the two line segments will be $\sqrt{g_{11}}\,dx^1$ and $\sqrt{g_{22}}\,dx^2$ respectively.
Thus the element of area is⁷

$$ dA = \sqrt{|g_{11}\,g_{22}|}\;dx^1 dx^2. \tag{2.14} $$

Figure 2.3 An element of area, on a manifold $M$, defined by the coordinate
intervals $dx^1$ and $dx^2$. The proper lengths $dl^1$ and $dl^2$ of these intervals are related
to $dx^1$ and $dx^2$ by the metric functions. If the coordinate lines are orthogonal
then the area of the element is $dl^1\,dl^2$.

⁶ The general case is discussed in Section 2.14.

⁷ We have implicitly assumed here that the manifold is strictly Riemannian. If the manifold is
pseudo-Riemannian, some of the elements $g_{ab}$ in (2.13) may be negative (see Section 2.13), and then we require the
modulus signs.

Similarly, for 3-volumes in the $(x^1, x^2, x^3)$-surface defined by $x^a$ = constant for
$a = 4, 5, \ldots, N$, we have

$$ d^3V = \sqrt{|g_{11}\,g_{22}\,g_{33}|}\;dx^1 dx^2 dx^3. \tag{2.15} $$

We may, of course, define 3-volumes for any other three-dimensional subspace.
We can define higher-dimensional ‘volume’ elements in a similar way until we
reach the $N$-dimensional volume element

$$ d^NV = \sqrt{|g_{11}\,g_{22} \cdots g_{NN}|}\;dx^1 dx^2 \cdots dx^N. $$


As examples of working with such metric functions, let us consider the
non-Euclidean spaces discussed in Section 2.9. We begin with the line element (2.9),

$$ ds^2 = \frac{a^2\,d\rho^2}{a^2 - \rho^2} + \rho^2\,d\phi^2, \tag{2.16} $$

which describes two-dimensional geometry on the surface of a sphere in terms
of the coordinates $(\rho, \phi)$, the geometrical meanings of which are illustrated in
Figure 2.4 assuming an embedding in three-dimensional Euclidean space. From
(2.16) we see that this coordinate system is orthogonal, with $g_{\rho\rho} = a^2/(a^2 - \rho^2)$
and $g_{\phi\phi} = \rho^2$ (no sums on $\rho$ or $\phi$).⁸ Let us consider a circle defined by $\rho = R$,
where $R$ is some constant, and calculate its length, its area and the distance from
its centre to the perimeter.

Figure 2.4 The surface of a sphere parameterised by the coordinates $(\rho, \phi)$
appearing in the line element (2.16).

⁸ This form of notation is quite common, once a particular coordinate system has been chosen, and it is usually
clear from the context that no summation is implied.

From (2.12) and (2.16), the distance in the surface from the centre to the
perimeter along a line of constant $\phi$ is

$$ D = \int_0^R \frac{a}{(a^2 - \rho^2)^{1/2}}\,d\rho = a\sin^{-1}\!\left(\frac{R}{a}\right), $$

while the circumference of the circle is given by

$$ C = \int_0^{2\pi} R\,d\phi = 2\pi R. $$

Similarly, from (2.14) we have, for the area of the spherical surface enclosed by $C$,

$$
A = \int_0^{2\pi}\!\!\int_0^R \frac{a}{(a^2 - \rho^2)^{1/2}}\,\rho\,d\rho\,d\phi
= 2\pi a^2\left[1 - \left(1 - \frac{R^2}{a^2}\right)^{1/2}\right].
$$

Note that if we rewrite the circumference $C$ and area $A$ in terms of the distance
$D$ then we obtain

$$
C = 2\pi a\sin\left(\frac{D}{a}\right)
\qquad\text{and}\qquad
A = 2\pi a^2\left[1 - \cos\left(\frac{D}{a}\right)\right]. \tag{2.17}
$$

Thus, as $D$ increases, both the circumference and area of the circle increase
until the point when $D = \pi a/2$, after which both $C$ and $A$ become smaller as $D$
increases.
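The integrals for $D$ and $A$ and the relations (2.17) can be verified numerically. The sketch below uses a simple midpoint rule (an assumed, not prescribed, choice of quadrature) with illustrative values $a = 1$ and $R = 0.6$, chosen so that the circle lies well inside the equator where the integrands are regular.

```python
import math

def midpoint(f, lower, upper, steps=20000):
    """Midpoint-rule approximation to the integral of f from lower to upper."""
    h = (upper - lower) / steps
    return h * sum(f(lower + (k + 0.5) * h) for k in range(steps))

a, R = 1.0, 0.6   # sphere radius and coordinate radius of the circle

# Proper distance from the centre to the perimeter: integral of sqrt(g_rho_rho).
D = midpoint(lambda rho: a / math.sqrt(a * a - rho * rho), 0.0, R)

# Circumference: sqrt(g_phi_phi) = R is constant on the circle.
C = 2 * math.pi * R

# Enclosed area from (2.14); the phi integral contributes a factor 2*pi.
A = 2 * math.pi * midpoint(
    lambda rho: a * rho / math.sqrt(a * a - rho * rho), 0.0, R)

# The same quantities expressed through D, as in (2.17).
C_from_D = 2 * math.pi * a * math.sin(D / a)
A_from_D = 2 * math.pi * a * a * (1 - math.cos(D / a))
```

For these values $D > R$ and $C < 2\pi D$: a circle on the sphere has a smaller circumference than a Euclidean circle of the same proper radius.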

In fact there is a slight subtlety here. As noted earlier, if we attempt to
parameterise points beyond the equator of the sphere using the coordinates $(\rho, \phi)$,
the system becomes degenerate, i.e. there is more than one point in the surface
with the same coordinates. The degenerate nature of the $(\rho, \phi)$ coordinate system
means that some care is required, for example, in calculating the total area of the
surface. By symmetry this is given by

$$ A_{\rm tot} = 2\int_0^{2\pi}\!\!\int_0^a \frac{a}{(a^2 - \rho^2)^{1/2}}\,\rho\,d\rho\,d\phi = 4\pi a^2. $$


Although we cannot easily visualise the geometry, we can perform similar
calculations for the line element (2.11),

$$ ds^2 = \frac{a^2}{a^2 - r^2}\,dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2, \tag{2.18} $$

which describes a non-Euclidean three-dimensional space that tends to Euclidean
three-dimensional space as $a \to \infty$. Let us consider a 2-sphere of coordinate radius
$r = R$ and calculate the circumference around the equator, the area, the volume
and the distance from its centre to the surface of the sphere.
From (2.12) and (2.18), the distance from the centre to the surface along a line
$\theta$ = constant, $\phi$ = constant is

$$ D = \int_0^R \frac{a\,dr}{(a^2 - r^2)^{1/2}} = a\sin^{-1}\!\left(\frac{R}{a}\right). $$

Noting that the equator of the sphere is the curve $r = R$, $\theta = \pi/2$, its circumference is
$C = \int_0^{2\pi} R\,d\phi = 2\pi R$, while the area of the surface $r = R$ and the volume it
encloses are obtained from (2.14) and (2.15) and read

$$ A = \int_0^{2\pi}\!\!\int_0^{\pi} R^2\sin\theta\,d\theta\,d\phi = 4\pi R^2, $$

$$
V = \int_0^{2\pi}\!\!\int_0^{\pi}\!\!\int_0^R \frac{a\,r^2\sin\theta}{(a^2 - r^2)^{1/2}}\,dr\,d\theta\,d\phi
= 4\pi a^3\left[\frac{1}{2}\sin^{-1}\!\left(\frac{R}{a}\right) - \frac{R}{2a}\left(1 - \frac{R^2}{a^2}\right)^{1/2}\right].
$$

It is not difficult to see that the familiar results of three-dimensional Euclidean
space are recovered when $R/a \ll 1$. Once again, we can rewrite our results in
terms of $D$ rather than $R$, and we find that $C$, $A$ and $V$ all have maximum values
at $D = \pi a/2$. By analogy with the above two-dimensional example, the total
volume of this space is

$$
V_{\rm tot} = 2\int_0^{2\pi}\!\!\int_0^{\pi}\!\!\int_0^a \frac{a\,r^2\sin\theta}{(a^2 - r^2)^{1/2}}\,dr\,d\theta\,d\phi = 2\pi^2 a^3.
$$


The three-dimensional non-Euclidean space described by the line element (2.18)
thus has a finite volume. We can generate a line element for an infinite
non-Euclidean three-dimensional space by making the substitution $a = ib$, i.e. choosing
the ‘radius’ of the space to be pure imaginary. The line element (2.18) then becomes

$$ ds^2 = \frac{b^2}{b^2 + r^2}\,dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2. $$

If we again consider the sphere defined by $r = R$, we find easily that in this space
$C = 2\pi R$ and $A = 4\pi R^2$ as before but the distance from the centre of the sphere
to its surface is now given by $D = b\sinh^{-1}(R/b)$. In this case, one finds that $C$,
$A$ and the volume $V$ of the sphere are all monotonically increasing functions.


2.11 Local Cartesian coordinates 

We now introduce a key property of Riemannian manifolds, to which we have
alluded in earlier sections. For the moment we will confine our attention to
manifolds that are strictly Riemannian, so that $ds^2 > 0$ always, but subsequently
we will extend our discussion to pseudo-Riemannian spaces, in which $ds^2$ can be
of either sign (or zero).

For a general Riemannian manifold, it is not possible to perform a coordinate
transformation $x^a \to x'^a$ that will take the line element $ds^2 = g_{ab}(x)\,dx^a dx^b$ into
the Euclidean form

$$ ds^2 = (dx'^1)^2 + (dx'^2)^2 + \cdots + (dx'^N)^2 = \delta_{ab}\,dx'^a dx'^b $$

at every point in the manifold. This is clear, since there are $N(N+1)/2$
independent metric functions $g_{ab}(x)$ but only $N$ coordinate transformation functions
$x'^a(x)$. As we shall now demonstrate, however, it is always possible to make a
coordinate transformation such that in the neighbourhood of some specified point
$P$ the line element takes the Euclidean form. In other words, we can always find
coordinates $x'^a$ such that at the point $P$ the new metric functions $g'_{ab}(x')$ satisfy

$$ g'_{ab}(P) = \delta_{ab}, \tag{2.19} $$

$$ \left.\frac{\partial g'_{ab}}{\partial x'^c}\right|_P = 0. \tag{2.20} $$

Thus, in the neighbourhood of $P$, we have

$$ g'_{ab}(x') = \delta_{ab} + O[(x' - x'_P)^2]. $$

Such coordinates are called local Cartesian coordinates at $P$.

From (2.5), the general transformation rule for the metric functions is

$$ g'_{ab} = \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b}\,g_{cd}, $$

which we require to satisfy the conditions (2.19) and (2.20) at our chosen point
$P$. If $x^a$ is an arbitrary given coordinate system and $x'^a$ is the desired system
then there will be some relation $x^a(x')$ connecting the two sets of coordinates.
Although we do not (as yet) know the required transformation, we can define it
in terms of its Taylor expansion about $P$:

$$
\begin{aligned}
x^a(x') = x^a_P
&+ \left(\frac{\partial x^a}{\partial x'^b}\right)_P (x'^b - x'^b_P)
 + \frac{1}{2}\left(\frac{\partial^2 x^a}{\partial x'^b\,\partial x'^c}\right)_P (x'^b - x'^b_P)(x'^c - x'^c_P) \\
&+ \frac{1}{6}\left(\frac{\partial^3 x^a}{\partial x'^b\,\partial x'^c\,\partial x'^d}\right)_P
   (x'^b - x'^b_P)(x'^c - x'^c_P)(x'^d - x'^d_P) + \cdots.
\end{aligned}
$$

The numbers of free independent variables we have for this purpose are as
follows:

$(\partial x^a/\partial x'^b)_P$ has $N^2$ independent values,

$(\partial^2 x^a/\partial x'^b\,\partial x'^c)_P$ has $\tfrac{1}{2}N^2(N+1)$ independent values,

$(\partial^3 x^a/\partial x'^b\,\partial x'^c\,\partial x'^d)_P$ has $\tfrac{1}{6}N^2(N+1)(N+2)$ independent values,

where we have made use of the fact that the second set of quantities is symmetric
in $b$ and $c$ and the third set of quantities is totally symmetric in $b$, $c$ and $d$. We
may compare this with the number of independent parameters we may want to fix:

$g'_{ab}(P)$ has $\tfrac{1}{2}N(N+1)$ independent values,

$(\partial g'_{ab}/\partial x'^c)_P$ has $\tfrac{1}{2}N^2(N+1)$ independent values,

$(\partial^2 g'_{ab}/\partial x'^c\,\partial x'^d)_P$ has $\tfrac{1}{4}N^2(N+1)^2$ independent values.

The first question is whether we can satisfy the requirement (2.19). This condition
consists of $N(N+1)/2$ independent equations, and to satisfy them we have $N^2$ free
values in $(\partial x^a/\partial x'^b)_P$. Therefore, they can indeed be satisfied, leaving $N(N-1)/2$
numbers to spare! These spare degrees of freedom correspond exactly to the number
of independent $N$-dimensional ‘rotations’ that leave $\delta_{ab}$ unchanged.

The next question is whether we can satisfy the requirement (2.20). This
condition consists of $N^2(N+1)/2$ independent equations, and we can choose an
equal number of free values $(\partial^2 x^a/\partial x'^b\,\partial x'^c)_P$ to satisfy them.

The final question is whether we can continue in this way to higher orders. In
other words, can we find a set of coordinates $x'^a$ such that
$(\partial^2 g'_{ab}/\partial x'^c\,\partial x'^d)_P = 0$?
This condition consists of $N^2(N+1)^2/4$ independent equations, but we have only
$N^2(N+1)(N+2)/6$ free values in $(\partial^3 x^a/\partial x'^b\,\partial x'^c\,\partial x'^d)_P$, so these equations cannot
in general be satisfied. This means that there are $N^2(N^2-1)/12$ ‘degrees of
freedom’ among the second derivatives $(\partial^2 g'_{ab}/\partial x'^c\,\partial x'^d)_P$, i.e. in general at least
this number of second derivatives will not vanish.
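The counting argument is easy to tabulate. The sketch below (function and key names are illustrative) evaluates the numbers of free Taylor coefficients and of conditions for a given dimension $N$, and forms the two differences discussed above: the spare ‘rotations’ at zeroth order and the second derivatives of the metric that cannot be removed.

```python
def local_cartesian_counts(N):
    """Free coefficients and conditions at each order, for dimension N."""
    free_first = N * N                            # (dx^a/dx'^b)_P
    free_second = N * N * (N + 1) // 2            # symmetric in b, c
    free_third = N * N * (N + 1) * (N + 2) // 6   # totally symmetric in b, c, d

    cond_metric = N * (N + 1) // 2                # g'_ab(P) = delta_ab
    cond_first = N * N * (N + 1) // 2             # first derivatives vanish
    cond_second = N * N * (N + 1) ** 2 // 4       # second derivatives vanish

    return {
        "spare_rotations": free_first - cond_metric,          # N(N-1)/2
        "first_order_balance": free_second - cond_first,      # always 0
        "unremovable_second_derivatives": cond_second - free_third,
    }

table = {N: local_cartesian_counts(N) for N in (2, 3, 4)}
```

For $N = 2, 3, 4$ the number of second derivatives that cannot be set to zero is 1, 6 and 20 respectively, in agreement with $N^2(N^2-1)/12$.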

Although we have shown, in principle, that it is always possible to define local 
Cartesian coordinates at any given point P, we have not shown explicitly how to 
find such coordinates. We will return to this point in Chapter 3. 


2.12 Tangent spaces to manifolds 

To aid our intuition of local Cartesian coordinates, it is useful to consider the
simple example of a two-dimensional Riemannian manifold, which we can often
consider as a generally curved surface embedded in three-dimensional Euclidean
space. A simple example is the surface of a sphere, shown in Figure 2.2. As we
have shown, at any arbitrary point $P$ we can find coordinates $x$ and $y$ (say) such
that in the neighbourhood of $P$ we have

$$ ds^2 = dx^2 + dy^2. $$

It thus follows that a Euclidean two-dimensional space (a plane) will match the
manifold locally at $P$. This Euclidean space is called the tangent space $T_P$ to the
manifold at $P$. In other words, in terms of our embedding picture a plane can
always be drawn at any arbitrary point on a two-dimensional Riemannian surface
in such a way that it is locally tangential to the surface (see Figure 2.5). Although
the tangent plane to a surface at $P$ gives a useful way of visualising the tangent
space of a manifold at a point, this view can be misleading. As we stressed
earlier, a manifold should be regarded as an entity in itself: there is no need for
a higher-dimensional space in which it and its tangent spaces are embedded.

Figure 2.5 The tangent plane $T_P$ to the curved surface $M$ at the point $P$.

We may extend the idea of tangent spaces to higher dimensions. At an arbitrary
point $P$ in an $N$-dimensional Riemannian manifold we can find a coordinate
system such that in the neighbourhood of $P$ the line element is Euclidean. Thus,
an $N$-dimensional Euclidean space matches the manifold locally at $P$. Just as each
point $P$ of an embedded two-dimensional surface has its tangent plane, making
contact with the surface at $P$, so each point $P$ of a manifold has a tangent space
$T_P$ attached to it.


2.13 Pseudo-Riemannian manifolds 

Thus far we have confined our attention almost exclusively to strictly Rieman¬ 
nian manifolds, in which ds 2 > 0 always. In a pseudo-Riemannian manifold, 
however, ds 2 can be either positive, negative or zero and it is therefore much 
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harder to visualise even two-dimensional manifolds of this type. Nevertheless, 
the mathematical tools we have developed so far arc straightforwardly applied to 
pseudo-Riemannian manifolds with little modification. 

The simplest way to understand pseudo-Riemannian manifolds is to consider
the transformation to local ‘Cartesian’ coordinates at some arbitrary point P. You
will notice from Section 2.11 that our argument showing that the condition (2.20)
holds for the derivatives of the metric functions in a Riemannian manifold can
be extended immediately to the pseudo-Riemannian case. Let us assume that
the coordinate system $x^a$ already satisfies this condition. However, the condition
(2.19) on the values of the metric functions themselves requires further investigation.
Let us now attempt to obtain a new coordinate system $x'^a$ in which (2.19)
is also satisfied. We note in passing that, in order for (2.20) to remain valid, the
new coordinates $x'^a$ must be related to the old ones $x^a$ by a linear transformation,
$x'^a = X^a{}_b x^b$, where the $X^a{}_b$ are constants.

In general, at a point P the metric functions in the new coordinate system are
given in terms of the original metric functions by

$$ g'_{ab}(P) = \left(\frac{\partial x^c}{\partial x'^a}\right)_P \left(\frac{\partial x^d}{\partial x'^b}\right)_P g_{cd}(P). \tag{2.21} $$

Let us define symmetric matrices G and G' having elements $g_{ab}(P)$ and $g'_{ab}(P)$
respectively. Similarly, we can define a matrix X having elements $(\partial x^a/\partial x'^b)_P$.
Then, in matrix notation, (2.21) can be written as

$$ G' = X^{\mathrm T} G X. $$

Since G is symmetric, it can be diagonalised by this similarity transformation,
provided that we choose the columns of X to be the normalised eigenvectors
of G (X is then orthogonal, so that $X^{\mathrm T} = X^{-1}$). Then
$G' = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N)$, where $\lambda_a$ is the $a$th eigenvalue of G (the
eigenvalues must all be real).

In a strictly Riemannian manifold, $ds^2 = g_{ab}\, dx^a dx^b$ is always positive at any
point P. Thus the matrix $G = [g_{ab}]$ at any point must be positive definite, i.e. all
its eigenvalues must be positive. At an arbitrary point in a pseudo-Riemannian
manifold, however, $ds^2$ can be positive, negative or zero, depending on the
direction in which one moves from P. Correspondingly, some of the eigenvalues
of G are negative.

If we now scale our coordinates according to $x'^a \to x'^a/\sqrt{|\lambda_a|}$ (note that here
there is no sum on $a$), we obtain at the point P

$$ G' = \mathrm{diag}(\pm 1, \pm 1, \ldots, \pm 1), $$

where the $+$ and $-$ signs depend on whether the corresponding eigenvalue is
positive or negative. Thus, at any arbitrary point P in a pseudo-Riemannian
manifold, it is always possible to find a coordinate system $x'^a$ such that in the
neighbourhood of P we have

$$ g'_{ab}(x') = \eta_{ab} + \mathcal{O}[(x' - x'_P)^2], $$

where $[\eta_{ab}] = \mathrm{diag}(\pm 1, \pm 1, \ldots, \pm 1)$. The number of positive entries minus the
number of negative entries in $[\eta_{ab}]$ is called the signature of the manifold and is
the same at all points.
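The construction above can be checked numerically in the simplest case. The following Python sketch is an added illustration, not part of the original argument. It assumes a two-dimensional manifold and a non-degenerate metric at P (no zero eigenvalues), and the helper names `diagonalise_2x2`, `congruence` and `signature` are hypothetical. It builds X from the normalised eigenvectors of G, divides each column by $\sqrt{|\lambda_a|}$ (the effect of rescaling the coordinates), and reads off the signature from the resulting diagonal matrix.

```python
import math


def diagonalise_2x2(G):
    """Eigenvalues and eigenvector matrix X of a real symmetric 2x2 matrix G.

    The columns of X are normalised eigenvectors, so X^T G X is diagonal.
    """
    a, b, d = G[0][0], G[0][1], G[1][1]
    if b == 0.0:  # G is already diagonal
        return [a, d], [[1.0, 0.0], [0.0, 1.0]]
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    lams = [mean + half_gap, mean - half_gap]
    cols = []
    for lam in lams:
        vx, vy = b, lam - a  # solves (G - lam*I) v = 0 when b != 0
        norm = math.hypot(vx, vy)
        cols.append((vx / norm, vy / norm))
    X = [[cols[0][0], cols[1][0]], [cols[0][1], cols[1][1]]]
    return lams, X


def congruence(X, G):
    """Return the matrix product X^T G X."""
    n = len(G)
    GX = [[sum(G[i][k] * X[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(X[k][i] * GX[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def signature(G):
    """Return (G', signature), where G' = diag(+-1, +-1) at the chosen point."""
    lams, X = diagonalise_2x2(G)
    # Rescaling the coordinates divides column a of X by sqrt(|lambda_a|).
    S = [[X[i][j] / math.sqrt(abs(lams[j])) for j in range(2)] for i in range(2)]
    G_prime = congruence(S, G)
    sig = sum(1 if G_prime[a][a] > 0 else -1 for a in range(2))
    return G_prime, sig


# ds^2 = 2 du dv: a flat metric written in 'null' coordinates (illustrative choice)
G_prime, sig = signature([[0.0, 1.0], [1.0, 0.0]])
```

For this example the eigenvalues of G are $+1$ and $-1$, so `G_prime` is diag(1, -1) to within rounding and `sig` is 0. A positive-definite input such as [[2, 1], [1, 2]] gives signature 2 instead.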

It follows that, at any arbitrary point P in a pseudo-Riemannian manifold, an
N-dimensional space with line element

$$ ds^2 = \pm (dx^1)^2 \pm (dx^2)^2 \pm \cdots \pm (dx^N)^2 $$

will match the manifold locally at P. Such a space is called pseudo-Euclidean and
is the tangent space $T_P$ to the pseudo-Riemannian manifold at P. An example of a
pseudo-Euclidean space is the four-dimensional Minkowski spacetime of special
relativity, which has the line element

$$ ds^2 = d(ct)^2 - dx^2 - dy^2 - dz^2 $$

when expressed in coordinates corresponding to a Cartesian inertial frame.
Minkowski spacetime thus has a signature of $1 - 3 = -2$.


2.14 Integration over general submanifolds 

In Section 2.10, we restricted our calculation of ‘volumes’ to coordinate systems 
$x^a$ that were orthogonal and to submanifolds that were obtained simply by allowing
some of the coordinates to be constants. In fact neither of these simplifications is 
necessary, and we are now in a position to consider the general case. 

Let us begin by calculating the full N-dimensional volume element $d^N V$ in an
N-dimensional (pseudo-)Riemannian manifold. From Section 2.10, we know that
if we are working in an orthogonal coordinate system then this volume element
is given by

$$ d^N V = \sqrt{|g_{11} g_{22} \cdots g_{NN}|}\; dx^1 dx^2 \cdots dx^N. $$

For such a coordinate system the matrix G is given by

$$ G = [g_{ab}] = \mathrm{diag}(g_{11}, g_{22}, \ldots, g_{NN}), $$

so that its determinant is simply the product of the diagonal elements,

$$ \det G = g_{11} g_{22} \cdots g_{NN}. $$

It is usual to denote $\det G$ simply by the symbol $g$. Thus, we may rewrite the
volume element as

$$ d^N V = \sqrt{|g|}\; dx^1 dx^2 \cdots dx^N. \tag{2.22} $$

What we will now show is that this expression remains valid for an arbitrary
coordinate system.

The key to proving the general result (2.22) for the volume element at some
arbitrary point P in the manifold is to transform to local Cartesian coordinates
$x'^a$ at P. We know that a small N-dimensional region at P will have volume
$d^N V = dx'^1 dx'^2 \cdots dx'^N$. In any other general coordinate system $x^a$ it is a
well-known result that

$$ dx'^1 dx'^2 \cdots dx'^N = J\, dx^1 dx^2 \cdots dx^N, \tag{2.23} $$

where the Jacobian factor $J$ is given by

$$ J = \det\left[\frac{\partial x'^a}{\partial x^b}\right]. $$

If, as in Section 2.13, we use X to denote the transformation matrix $[\partial x^a/\partial x'^b]$ then
$J = \det(X^{-1}) = (\det X)^{-1}$. Defining matrices G and G' as those having elements
$g_{ab}(P)$ and $g'_{ab}(P)$ respectively, we have (see Section 2.13)

$$ G' = X^{\mathrm T} G X. \tag{2.24} $$

Taking determinants of both sides of (2.24) and denoting $\det G$ by $g$ and $\det G'$
by $g'$ we obtain

$$ g' = (\det X)^2 g = \frac{g}{J^2}. $$

Since the $x'^a$ are locally Cartesian coordinates, $G' = \mathrm{diag}(\pm 1, \pm 1, \ldots, \pm 1)$, where
the number of positive and negative signs depends on the signature of the manifold.
Thus we have $g' = \pm 1$, so that $g = \pm J^2$. Hence, we obtain the required result:

$$ d^N V = dx'^1 dx'^2 \cdots dx'^N = \sqrt{|g|}\; dx^1 dx^2 \cdots dx^N. $$
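As a concrete check of this result, the sketch below (an added illustration in Python; the function names and the sample point are hypothetical choices) takes the primed coordinates to be Cartesian coordinates of three-dimensional Euclidean space and the unprimed ones to be spherical polars $(r, \theta, \phi)$. Since G' is then the identity, the relation $G' = X^{\mathrm T} G X$ gives $G = A^{\mathrm T} A$ with $A = [\partial x'^a/\partial x^b] = X^{-1}$, and one expects $J = \det A = r^2 \sin\theta$ and $g = J^2$.

```python
import math


def jacobian_cartesian_wrt_spherical(r, theta, phi):
    """Matrix A = [dx'^a/dx^b] with x' = (x, y, z) and x = (r, theta, phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [[st * cp, r * ct * cp, -r * st * sp],
            [st * sp, r * ct * sp, r * st * cp],
            [ct, -r * st, 0.0]]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def metric_from_jacobian(A):
    """G = A^T A: the metric in the unprimed coordinates when G' is the identity."""
    return [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


r, theta, phi = 2.0, 0.7, 1.3          # arbitrary sample point
A = jacobian_cartesian_wrt_spherical(r, theta, phi)
J = det3(A)                            # Jacobian factor, expected r^2 sin(theta)
G = metric_from_jacobian(A)            # expected diag(1, r^2, r^2 sin^2(theta))
g = det3(G)                            # expected J^2
volume_factor = math.sqrt(abs(g))      # the factor multiplying dr dtheta dphi
```

The computed `volume_factor` agrees with the familiar $r^2 \sin\theta$ of the spherical-polar volume element, as (2.22) requires.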


We now turn to the question of how to integrate over submanifolds that are not
defined simply by setting some of the coordinates $x^a$ to be constant. Consider
some M-dimensional subspace of an N-dimensional manifold. In general, the
subspace can be defined by the N parametric equations

$$ x^a = x^a(u^1, u^2, \ldots, u^M), $$

where the $u^i$ ($i = 1, 2, \ldots, M$) may be considered simply as a set of coordinates
that parameterise the subspace. If we consider two neighbouring points in the
subspace whose parameters differ by $du^i$ then the coordinate separation between
these points is simply

$$ dx^a = \frac{\partial x^a}{\partial u^i}\, du^i. $$

Thus the distance $ds$ between the points is given by

$$ ds^2 = g_{ab}\, dx^a dx^b = g_{ab} \frac{\partial x^a}{\partial u^i} \frac{\partial x^b}{\partial u^j}\, du^i du^j, $$

which we may write as

$$ ds^2 = h_{ij}\, du^i du^j, $$

where the $h_{ij}$ are the induced metric functions on the subspace and are given by

$$ h_{ij} = g_{ab} \frac{\partial x^a}{\partial u^i} \frac{\partial x^b}{\partial u^j}. \tag{2.25} $$

We can now work simply in terms of this subspace and regard it as a
manifold in itself. Thus the volume element for integrals over this subspace is
given in terms of the parameters $u^i$ by

$$ d^M V = \sqrt{|h|}\; du^1 du^2 \cdots du^M, $$

where $h = \det[h_{ij}]$.
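The induced metric and the resulting volume element can be illustrated with a short numerical sketch. The Python code below is an addition for illustration only: it assumes the ambient space is three-dimensional Euclidean space in Cartesian coordinates (so that $g_{ab} = \delta_{ab}$), takes the subspace to be a 2-sphere of radius $a$ parameterised by $(u^1, u^2) = (\theta, \phi)$, and approximates the derivatives $\partial x^a/\partial u^i$ by central differences. The function names are hypothetical.

```python
import math


def embedding(theta, phi, a=1.0):
    """Parametric equations x^a(u) of a 2-sphere of radius a in Euclidean R^3."""
    return (a * math.sin(theta) * math.cos(phi),
            a * math.sin(theta) * math.sin(phi),
            a * math.cos(theta))


def induced_metric(u, a=1.0, eps=1e-6):
    """h_ij = delta_ab (dx^a/du^i)(dx^b/du^j), using central differences."""
    tangents = []
    for i in range(2):
        plus, minus = list(u), list(u)
        plus[i] += eps
        minus[i] -= eps
        xp = embedding(plus[0], plus[1], a)
        xm = embedding(minus[0], minus[1], a)
        tangents.append([(p - m) / (2.0 * eps) for p, m in zip(xp, xm)])
    return [[sum(tangents[i][k] * tangents[j][k] for k in range(3))
             for j in range(2)] for i in range(2)]


def sphere_area(a=1.0, n=200):
    """Midpoint-rule integral of sqrt|h| over 0 < theta < pi, 0 < phi < 2 pi."""
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        h = induced_metric((theta, 0.0), a)
        det_h = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        total += math.sqrt(abs(det_h)) * dtheta
    # For this surface sqrt|h| does not depend on phi, so that integral is 2 pi.
    return 2.0 * math.pi * total


h_sample = induced_metric((math.pi / 3.0, 0.4))   # close to diag(1, sin^2(pi/3))
area = sphere_area(a=1.0)                         # close to 4 pi
```

The numerical $h_{ij}$ reproduces the familiar line element $ds^2 = a^2(d\theta^2 + \sin^2\theta\, d\phi^2)$, and integrating $\sqrt{|h|}$ recovers the area $4\pi a^2$ to the accuracy of the quadrature.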

It is also worth noting here that the relation (2.25) is the key to determining
whether one can embed a given manifold in another manifold of higher
dimension. Suppose we begin with an M-dimensional manifold possessing the
metric $h_{ij}(u)$ when labelled with the coordinates $u^i$ ($i = 1, 2, \ldots, M$). In order to
embed this manifold in an N-dimensional manifold (where $N > M$) with metric
$g_{ab}(x)$ in the coordinates $x^a$ ($a = 1, 2, \ldots, N$), one must be able to find
functions $x^a(u)$ that satisfy the relation (2.25).


2.15 Topology of manifolds 

In this chapter we have discussed only the local geometry of manifolds, which 
is defined at any point by the line element (2.4) giving the distance between 
points with infinitesimal coordinate separations. In addition to this local geometry 
a manifold also has a global geometry or topology. The topology of a manifold 
is defined by identifying certain sets of points, that is, regarding them as being 
coincident. For example, in Figure 2.1, we identified the line AA' with the line
BB'. This property can be detected by a ‘bug’ on the surface, since by continuing
in a straight line in a certain direction, it can get back to where it started. Thus a
topology (in this case the fact that the space is periodic in one of the coordinates)
is an intrinsic property of a manifold.

We shall see that general relativity is a ‘local’ theory, in which the local 
geometry (or curvature) of the four-dimensional spacetime manifold at any point 
is determined by the energy density of matter and/or radiation at that point. The 
field equations of general relativity do not constrain the global topology of the 
spacetime manifold. 


Exercises 

2.1 In three-dimensional Euclidean space $R^3$, write down expressions for the change of
coordinates from Cartesian coordinates $[x^a] = (x, y, z)$ to spherical polar coordinates
$[x'^a] = (r, \theta, \phi)$. Obtain expressions for the transformation and inverse transformation
matrices in terms of the primed coordinates. By calculating the Jacobians $J$ and $J'$ for
the transformation and its inverse, find where the transformation is non-invertible.

2.2 Write down the line element for three-dimensional Euclidean space in spherical
polar coordinates $x^a$ and cylindrical polar coordinates $x'^a$. Hence identify the metric
functions in each coordinate system and show that they obey

$$ g'_{ab}(x') = \frac{\partial x^c}{\partial x'^a} \frac{\partial x^d}{\partial x'^b}\, g_{cd}(x). $$

2.3 In three-dimensional Euclidean space a coordinate system $x'^a$ is related to the Cartesian
coordinates $x^a$ by

$$ x^1 = x'^1 + x'^2, \qquad x^2 = x'^1 - x'^2, \qquad x^3 = 2x'^1 x'^2 + x'^3. $$

Describe the coordinate surfaces in the primed system. Obtain the metric functions
$g'_{ab}$ in the primed system and hence show that these coordinates are not orthogonal.
Calculate the volume element $dV$ in the primed coordinate system.

2.4 Consider the surface of a 2-sphere embedded in three-dimensional Euclidean space.
In a stereographic projection, one assigns coordinates $(\rho, \phi)$ to each point on the
surface of the sphere. The $\phi$-coordinate is the standard azimuthal polar angle. The
$\rho$-coordinate of each point is obtained by drawing a straight line in three dimensions
from the south pole of the sphere through the point in question and extending the line
until it intersects the tangent plane to the north pole of the sphere; the $\rho$-coordinate
is then the distance in the tangent plane from the north pole to the intersection point.
Show that the line element for the surface of the sphere in these coordinates is


$$ ds^2 = \frac{d\rho^2}{(1 + \rho^2/4a^2)^2} + \frac{\rho^2}{(1 + \rho^2/4a^2)^2}\, d\phi^2. $$


At what point(s) on the sphere are these coordinates degenerate? If instead one works
in terms of the Cartesian coordinates $x$ and $y$ in the tangent plane at the north pole,
what is the corresponding form of the line element? At what point(s) on the sphere
are these new coordinates degenerate?

2.5 Consider the surface of the Earth, which we assume for simplicity to be a 2-sphere
of radius $a$. In terms of standard polar coordinates $(\theta, \phi)$, the longitude of a point,
in radians rather than the usual degrees, is simply $\phi$ (measured eastwards from the
Greenwich meridian), whereas its latitude is $\lambda = \pi/2 - \theta$ radians. Show that the line
element on the Earth’s surface in these coordinates is

$$ ds^2 = a^2 (d\lambda^2 + \cos^2\lambda\, d\phi^2). $$

To make a map of the Earth’s surface, we introduce the functions $x = x(\lambda, \phi)$ and
$y = y(\lambda, \phi)$ and use them as Cartesian coordinates on a flat rectangular piece of
paper. Each choice of the two functions corresponds to a different map projection.
The Mercator projection is defined by



where W and H are the width and height of the map respectively. Find the line 
element for this projection. 

2.6 For the general map projection discussed in Exercise 2.5, show that the angle between
two directions at some point on the Earth’s surface will equal the angle between the
corresponding directions on the map, provided that the functions $x$ and $y$ are chosen
such that

$$ \Omega(x, y)\,(dx^2 + dy^2) = a^2 (d\lambda^2 + \cos^2\lambda\, d\phi^2), $$

for some function $\Omega(x, y)$. Show that the Mercator projection satisfies this condition.
Write down the general requirement on $x$ and $y$ for an equal-area projection, in
which the area of any region of the map is proportional to the corresponding area on
the Earth’s surface. Find such a projection. Is it possible to obtain a projection that
simultaneously is equal-area and preserves angles?

2.7 A conformal transformation is not a change of coordinates but an actual change in
the geometry of a manifold such that the metric tensor transforms as

$$ \bar{g}_{ab}(x) = \Omega^2(x)\, g_{ab}(x), $$

where $\Omega(x)$ is some non-vanishing scalar function of position. In a pseudo-Riemannian
manifold, show that if $x^a(\lambda)$ is a null curve with respect to $g_{ab}$ (i.e.
$ds^2 = 0$ along the curve), then it is also a null curve with respect to $\bar{g}_{ab}$. Is this true
for timelike curves?

2.8 A curve on the surface of a 2-sphere of radius $a$ is defined parametrically by $\theta = u$,
$\phi = 2u - \pi$, where $0 \le u \le \pi$. Sketch the curve and show that its total length is




Show that, in general, the length of a curve is independent of the parameter used to
describe it.

2.9 Show that the line element of a 3-sphere of radius $a$ embedded in four-dimensional
Euclidean space can be written in the form

$$ ds^2 = a^2 [d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\, d\phi^2)]. $$

Hence, in this three-dimensional non-Euclidean space, calculate the area of the
2-sphere defined by $\chi = \chi_0$. Also find the total volume of the three-dimensional
space.

2.10 Consider the three-dimensional space with line element

$$ ds^2 = \frac{dr^2}{1 - 2\mu/r} + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2), $$

and calculate the following quantities:

(a) the area of a sphere of coordinate radius $r = R$;

(b) the 3-volume of a sphere of coordinate radius $r = R$;

(c) the radial distance between the sphere $r = 2\mu$ and the sphere $r = 3\mu$;

(d) the 3-volume contained between the two spheres in part (c).

Verify that your answers reduce to the usual Euclidean results in the limit $\mu \to 0$.

2.11 Prove the following results used in Section 2.11:

(a) $(\partial x^a/\partial x'^b)_P$ has $N^2$ independent values;

(b) $(\partial^2 x^a/\partial x'^b \partial x'^c)_P$ has $\tfrac{1}{2} N^2 (N + 1)$ independent values;

(c) $(\partial^3 x^a/\partial x'^b \partial x'^c \partial x'^d)_P$ has $\tfrac{1}{6} N^2 (N + 1)(N + 2)$ independent values;

(d) $g'_{ab}(P)$ has $\tfrac{1}{2} N (N + 1)$ independent values;

(e) $(\partial g'_{ab}/\partial x'^c)_P$ has $\tfrac{1}{2} N^2 (N + 1)$ independent values;

(f) $(\partial^2 g'_{ab}/\partial x'^c \partial x'^d)_P$ has $\tfrac{1}{4} N^2 (N + 1)^2$ independent values.

Hence show that, in a general Riemannian manifold, at least $N^2 (N^2 - 1)/12$ of the
second derivatives $(\partial^2 g'_{ab}/\partial x'^c \partial x'^d)_P$ will not vanish in any coordinate system.

2.12 Consider the two-dimensional space with line element

$$ ds^2 = \frac{dr^2}{1 - 2\mu/r} + r^2 d\phi^2. $$

Using the result (2.25), show that this geometry can be embedded in three-dimensional
Euclidean space, and find the equations for the corresponding two-dimensional
surface.

2.13 By identifying a suitable coordinate transformation, show that the line element

$$ ds^2 = (c^2 - a^2 t^2)\, dt^2 - 2at\, dt\, dx - dx^2 - dy^2 - dz^2, $$
where $a$ is a constant, can be reduced to the Minkowski line element.


The notion of a vector is extremely useful in describing physical processes and is
employed in nearly all branches of mathematical physics. The reader should be
familiar with vector calculus in two- and three-dimensional Euclidean spaces and
with the description of vectors in terms of their components in simple coordinate
systems such as Cartesian or spherical polar coordinates.

The concept of vectors is also very useful in both special and general relativity,
and we now consider how to generalise our familiar Euclidean ideas in order to
define vectors in a general (pseudo-)Riemannian manifold and in arbitrary coordinate
systems. For illustration, however, we will often consider two-dimensional
Riemannian manifolds that can be envisaged as surfaces embedded in three-dimensional
Euclidean space. An example is the surface of a sphere, which we
might take to be the surface of the Earth (remembering to consider ourselves as
truly two-dimensional ‘bugs’!).


3.1 Scalar fields on manifolds 

Before considering vector fields on manifolds, let us briefly discuss scalar fields. 
A real (or complex) scalar field defined on (some region of) a manifold M assigns 
a real (or complex) value to each point P in (that region of) M: an example is 
the air temperature on the surface of the Earth. If one labels the points in M
using some coordinate system $x^a$ then one can express the value at each point
as a function of the coordinates $\phi(x^a)$. The value of the scalar field at any point
P does not depend on the chosen coordinate system. Thus, under an arbitrary
coordinate transformation $x^a \to x'^a$, the scalar field is described by a different
function $\phi'(x'^a)$ of the new coordinates, such that

$$ \phi'(x'^a) = \phi(x^a). $$

Indeed, this is the defining characteristic for a scalar field.


3.2 Vector fields on manifolds

A vector field defined on (some region of) a manifold M assigns a single vector
to each point P in (that region of) M. The vector at P is often drawn as an
extended directed line segment with its base at P, but this convention requires
careful interpretation on general manifolds. Once again it is convenient to illustrate
our discussion by considering a two-dimensional manifold such as the spherical 
surface of the Earth. Let us consider, for example, the vector field defined by the 
wind velocity (at ground level). Wind velocity is measured at a given observation 
point and refers solely to that point, despite the visual convenience of showing it 
on a chart as an arrow apparently extending for a long distance. It is an example 
of a local vector. Other examples include momentum, current density and velocity 
in general. Such vectors are defined at a given point P. More accurately, they can 
be measured by an observer (bug) in a laboratory covering a small region of the 
manifold in the neighbourhood of P. 

At an arbitrary point P in the manifold, any local vector $\mathbf{v}$ lies in the tangent
space $T_P$ to the manifold at P. Indeed, $T_P$ consists of the set of all (local) vectors
at the point P. This may be visualised simply for two-dimensional manifolds by
embedding them as surfaces in three-dimensional Euclidean space (see Figure 3.1),
but the idea is easily extended to higher dimensions and can be defined independently
of any embedding. As we discussed in Chapter 2, the tangent space at any
point of a (pseudo-)Riemannian manifold is a (pseudo-)Euclidean space of the
same dimensionality. Moreover, at an arbitrary point P, local vectors obey all the
usual rules of vector algebra in (pseudo-)Euclidean geometry.

It is important to realise, however, that local vectors defined at different points
P and Q in the manifold lie in different tangent spaces. Thus there is no way of
adding local vectors at different points. Other notions that must be abandoned are
those of position vectors and displacement vectors, which clearly are not locally
defined. Using an embedding picture of a two-dimensional manifold, this may be
visualised as shown in Figure 3.2. The ‘displacement vector’ connecting the points
P and Q does not lie in the manifold and thus has no intrinsic geometrical meaning.
Heuristically, however, we can define the displacement vector $\delta\mathbf{s}$ between two
nearby points P and Q, since this is a local quantity. In the limit $Q \to P$, the
vector $\delta\mathbf{s}$ lies in the tangent space at P.

Figure 3.1 Local vectors defined at the point P lie in the tangent space $T_P$ to
the manifold at that point.

Figure 3.2 The displacement vector between two general points P and Q does
not lie in the manifold M, unless the manifold is itself Euclidean.

Clearly, if the original manifold is itself (pseudo-)Euclidean then the tangent
space at any point coincides with the manifold. Thus vectors defined at different
points in the manifold do lie in the same space, and the notions of position and
displacement vectors are valid. This reflects our common experience of vector
algebra.


3.3 Tangent vector to a curve 


The most obvious example of a vector field defined on (a subregion of) a manifold
is the tangent vector to a curve C, which is defined at each point along C. The
notion of a tangent vector to a curve is also central to our subsequent development
of basis vectors, described below.

Consider a curve C in an N-dimensional manifold. This curve may be described
by the N parametric equations $x^a(u)$, where $u$ is some general parameter that
varies along the curve. At any point P along C, the tangent vector $\mathbf{t}$ to the curve,
with respect to the parameter value $u$, is defined as

$$ \mathbf{t} = \lim_{\delta u \to 0} \frac{\delta\mathbf{s}}{\delta u}, \tag{3.1} $$

where $\delta\mathbf{s}$ is the infinitesimal separation vector between the point P and some
nearby point Q on the curve corresponding to the parameter value $u + \delta u$. Clearly
$\mathbf{t}$ will lie in the tangent space $T_P$ at the point P: this is illustrated in Figure 3.3.

Figure 3.3 The tangent vector $\mathbf{t}$ to the curve C at a point P.

Although the heuristic approach we have adopted here is perfectly adequate for 
our purposes, in a general manifold the formal mathematical device for constructing
the tangent vector to some curve at P is to identify $\mathbf{t}$ with the directional
derivative operator along the curve at that point. This is discussed further in 
Appendix 3A and, in fact, enables one to give a precise mathematical meaning to 
the general notion of a vector in a non-Euclidean manifold. 


3.4 Basis vectors 

As we have seen, a vector field on a manifold is defined simply by giving, in
a smooth manner, a prescription for a local vector $\mathbf{v}(x)$ at each point in the
manifold. At each point P the vector lies in the tangent space $T_P$ at that point. This
vector is a geometrical entity, defined independently of any coordinate system
with which we choose to label points in the manifold. Nevertheless, at each point
P we can define a set of basis vectors $\mathbf{e}_a$ for the tangent space $T_P$, the number of
such vectors being equal to the dimension of $T_P$ and hence of M (how this may
be achieved will be discussed shortly). Any vector at P can then be expressed
as a linear combination of these basis vectors, provided that they are linearly
independent, which we will assume is always the case. Thus, we can express the
local vector field $\mathbf{v}(x)$ at each point in terms of basis vectors $\mathbf{e}_a(x)$ defined at
each point:

$$ \mathbf{v}(x) = v^a(x)\, \mathbf{e}_a(x). $$

The numbers $v^a(x)$ are known as the contravariant components of the vector field
$\mathbf{v}(x)$ in the basis $\mathbf{e}_a(x)$.

For any set of basis vectors $\mathbf{e}_a(x)$, we can define a second set of vectors called
the dual basis vectors. Instead of denoting the dual basis vectors by some other
kernel letter, it is the convention to denote a member of this second basis set by
$\mathbf{e}^a(x)$. Although the positioning of the index may seem odd (not least because of
the possible confusion with powers), it enables effective use of the summation
convention that we shall adopt in due course. At any point P, the dual basis
vectors are defined by the relation

$$ \mathbf{e}^a(x) \cdot \mathbf{e}_b(x) = \delta^a_b, \tag{3.2} $$

so that $\mathbf{e}^a$ and $\mathbf{e}_a$ form reciprocal systems of vectors.

The dual basis vectors at P also lie in the tangent space $T_P$ and form an
alternative basis for it.$^1$ Thus, we can also express the local vector field $\mathbf{v}(x)$ at
each point as a linear combination of the dual basis vectors $\mathbf{e}^a(x)$ defined at that
point:

$$ \mathbf{v}(x) = v_a(x)\, \mathbf{e}^a(x). $$

The numbers $v_a(x)$ are known as the covariant components of the vector $\mathbf{v}(x)$ in
the basis $\mathbf{e}^a(x)$.

Using the relation (3.2) we can find simple expressions for the contravariant
and covariant components of a vector $\mathbf{v}$. For example,$^2$

$$ \mathbf{v} \cdot \mathbf{e}^a = v^b\, \mathbf{e}_b \cdot \mathbf{e}^a = v^b \delta^a_b = v^a, $$

where we have used the fact that $\delta^a_b$ can be used to replace one index with another.
Thus we may write $v^a = \mathbf{v} \cdot \mathbf{e}^a$. Similarly, we may show that $v_a = \mathbf{v} \cdot \mathbf{e}_a$. We now
consider how a set of basis vectors (and their duals) may be constructed at each
point P in the manifold.
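A small numerical example may help to fix these definitions. The Python sketch below is an added illustration, assuming a tangent space that is the ordinary Euclidean plane and a deliberately skew basis $\mathbf{e}_1 = (1, 0)$, $\mathbf{e}_2 = (1, 1)$; the function names `dual_basis` and `dot` are hypothetical. It constructs the dual basis from the reciprocity relation $\mathbf{e}^a \cdot \mathbf{e}_b = \delta^a_b$ and then evaluates $v^a = \mathbf{v} \cdot \mathbf{e}^a$ and $v_a = \mathbf{v} \cdot \mathbf{e}_a$.

```python
def dot(u, w):
    """Euclidean scalar product of two plane vectors."""
    return u[0] * w[0] + u[1] * w[1]


def dual_basis(e1, e2):
    """Dual basis (e^1, e^2) of a basis (e_1, e_2) of the Euclidean plane.

    The dual vectors are the rows of the inverse of the matrix whose
    columns are e_1 and e_2, which guarantees e^a . e_b = delta^a_b.
    """
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if det == 0:
        raise ValueError("basis vectors must be linearly independent")
    return ((e2[1] / det, -e2[0] / det), (-e1[1] / det, e1[0] / det))


e_lower = ((1.0, 0.0), (1.0, 1.0))          # e_1, e_2 (illustrative skew basis)
e_upper = dual_basis(*e_lower)              # e^1 = (1, -1), e^2 = (0, 1)

v = (3.0, 2.0)                              # a vector given by Cartesian components
v_contra = [dot(v, e) for e in e_upper]     # v^a = v . e^a  ->  [1.0, 2.0]
v_co = [dot(v, e) for e in e_lower]         # v_a = v . e_a  ->  [3.0, 5.0]

# Rebuild v from either set of components: v = v^a e_a = v_a e^a.
v_from_contra = tuple(sum(v_contra[a] * e_lower[a][k] for a in range(2)) for k in range(2))
v_from_co = tuple(sum(v_co[a] * e_upper[a][k] for a in range(2)) for k in range(2))
```

Both reconstructions return the original vector (3, 2), even though the two sets of components differ because the basis is not orthonormal.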


Coordinate basis vectors 


An obvious basis in which to describe local vectors is the coordinate basis. In any
particular coordinate system $x^a$, we can define at every point P of the manifold
a set of N coordinate basis vectors

$$ \mathbf{e}_a = \lim_{\delta x^a \to 0} \frac{\delta\mathbf{s}}{\delta x^a}, \tag{3.3} $$

where $\delta\mathbf{s}$ is the infinitesimal vector displacement between P and a nearby point Q
whose coordinate separation from P is $\delta x^a$ along the $x^a$ coordinate curve. Thus
$\mathbf{e}_a$ is the tangent vector to the $x^a$ coordinate curve at the point P. This set of
vectors provides a basis for the tangent space $T_P$ at the point P (see Figure 3.4).


1 More precisely, these vectors define the dual tangent space $T_P^*$ at P, but this subtlety need not concern us
here.

2 From now on we will no longer make explicit the dependence of the basis vectors and components on the
position $x$ in the manifold, except where including this argument makes the explanation clearer.

Figure 3.4 The coordinate basis vectors $\mathbf{e}_a$ at a point P in a manifold are the
tangent vectors to the coordinate curves in the manifold and form a basis for the
tangent space at P.


From the definition (3.3), we see that if two nearby points P and Q have
coordinates $x^a$ and $x^a + dx^a$ respectively, where now we allow $dx^a$ to be non-zero
for all $a$, then their infinitesimal vector separation is given by

$$ d\mathbf{s} = \mathbf{e}_a(x)\, dx^a. \tag{3.4} $$

We can use this expression to relate the inner product of the coordinate basis
vectors at some arbitrary point P to the value of the metric functions $g_{ab}(x)$ at
that point. From (3.4), we have

$$ ds^2 = d\mathbf{s} \cdot d\mathbf{s} = (dx^a\, \mathbf{e}_a) \cdot (dx^b\, \mathbf{e}_b) = (\mathbf{e}_a \cdot \mathbf{e}_b)\, dx^a dx^b. $$

Comparing this with the standard expression $ds^2 = g_{ab}(x)\, dx^a dx^b$, (2.4), for the
line element, we find that

$$ \mathbf{e}_a(x) \cdot \mathbf{e}_b(x) = g_{ab}(x). \tag{3.5} $$

Thus, quite generally, in a coordinate basis the scalar product of two vectors is
given by

$$ \mathbf{v} \cdot \mathbf{w} = (v^a \mathbf{e}_a) \cdot (w^b \mathbf{e}_b) = g_{ab}\, v^a w^b. $$

If the basis $\mathbf{e}^a(x)$ is dual to a coordinate basis $\mathbf{e}_a(x)$ then the $x^a$-coordinate
distance between two nearby points separated by the displacement vector $d\mathbf{s}$ is
given by

$$ dx^a = \mathbf{e}^a \cdot d\mathbf{s}. $$

Moreover, in this case we may use the dual basis vectors to define the quantities

$$ g^{ab}(x) = \mathbf{e}^a(x) \cdot \mathbf{e}^b(x), \tag{3.6} $$

which, as we will show, form the contravariant components of the metric tensor
and are in general different from the quantities $g_{ab}(x)$; we will return to these
later.
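To make the relation between coordinate basis vectors and the metric functions concrete, here is a brief Python sketch (an added illustration; the function names are hypothetical). It assumes the Euclidean plane with plane polar coordinates $(\rho, \phi)$, for which the coordinate basis vectors are the derivatives of the Cartesian position $(x, y) = (\rho\cos\phi, \rho\sin\phi)$ with respect to each coordinate. Their scalar products should reproduce $g_{ab} = \mathrm{diag}(1, \rho^2)$, and those of the dual basis should give $g^{ab} = \mathrm{diag}(1, 1/\rho^2)$.

```python
import math


def coordinate_basis(rho, phi):
    """Cartesian components of e_rho and e_phi for plane polar coordinates."""
    e_rho = (math.cos(phi), math.sin(phi))                 # d(x, y)/d(rho)
    e_phi = (-rho * math.sin(phi), rho * math.cos(phi))    # d(x, y)/d(phi)
    return e_rho, e_phi


def dual_basis(e1, e2):
    """Vectors e^1, e^2 satisfying e^a . e_b = delta^a_b in the plane."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    return ((e2[1] / det, -e2[0] / det), (-e1[1] / det, e1[0] / det))


def gram(basis):
    """Matrix of Euclidean scalar products between the members of a basis."""
    return [[sum(p * q for p, q in zip(u, w)) for w in basis] for u in basis]


rho, phi = 2.0, 0.5                 # arbitrary sample point
e = coordinate_basis(rho, phi)
g_lower = gram(e)                   # g_ab = e_a . e_b
g_upper = gram(dual_basis(*e))      # g^ab = e^a . e^b
```

Note that $\mathbf{e}_\phi$ has length $\rho$ rather than unity: coordinate basis vectors are in general neither normalised nor orthogonal.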


Orthonormal basis vectors at a point 

At any given point P in a manifold, it is often useful to define a set of orthonormal
basis vectors $\hat{\mathbf{e}}_a$ in $T_P$, which are chosen to be of unit length and orthogonal to
one another. This is expressed mathematically by the requirement that at P

$$ \hat{\mathbf{e}}_a \cdot \hat{\mathbf{e}}_b = \eta_{ab}, \tag{3.7} $$

where $[\eta_{ab}] = \mathrm{diag}(\pm 1, \pm 1, \ldots, \pm 1)$ gives the Cartesian line element of the tangent
space $T_P$ and depends on the signature of the (in general) pseudo-Riemannian
manifold (see Section 2.13). These orthonormal basis vectors need not be related
to any particular coordinate system that we are using to label the manifold,
although they can always be defined by, for example, giving their components
in a coordinate basis. Moreover, it is clear from (3.7) that the orthonormal basis
vectors $\hat{\mathbf{e}}_a$ at P are in fact the coordinate basis vectors of a coordinate system for
which $g_{ab}(P) = \eta_{ab}$ (or $g_{ab}(P) = \delta_{ab}$ for a strictly Riemannian manifold).


3.5 Raising and lowering vector indices 

Unless otherwise stated, we will assume that we are working with a coordinate
basis, as discussed above, and its dual. The contravariant and covariant components
in these bases are equally good ways of specifying a vector. The link
between them is found by considering the different ways in which one can write
the scalar product $\mathbf{v} \cdot \mathbf{w}$ of two vectors. First, we can write

$$ \mathbf{v} \cdot \mathbf{w} = (v^a \mathbf{e}_a) \cdot (w^b \mathbf{e}_b) = (\mathbf{e}_a \cdot \mathbf{e}_b)\, v^a w^b = g_{ab}\, v^a w^b, $$

where we have used the contravariant components of the two vectors. Similarly,
using the covariant components, we can write the scalar product as

$$ \mathbf{v} \cdot \mathbf{w} = (v_a \mathbf{e}^a) \cdot (w_b \mathbf{e}^b) = (\mathbf{e}^a \cdot \mathbf{e}^b)\, v_a w_b = g^{ab}\, v_a w_b. $$

Finally, we could express the scalar product in terms of the contravariant components
of one vector and the covariant components of the other,

$$ \mathbf{v} \cdot \mathbf{w} = (v^a \mathbf{e}_a) \cdot (w_b \mathbf{e}^b) = v^a w_b\, (\mathbf{e}_a \cdot \mathbf{e}^b) = v^a w_b\, \delta^b_a = v^a w_a; $$

similarly, we could write

$$ \mathbf{v} \cdot \mathbf{w} = (v_a \mathbf{e}^a) \cdot (w^b \mathbf{e}_b) = v_a w^b\, (\mathbf{e}^a \cdot \mathbf{e}_b) = v_a w^b\, \delta^a_b = v_a w^a. $$

By comparing these four alternative expressions for the scalar product of two
vectors, we can deduce one of the most useful properties of the quantities $g_{ab}$ and
$g^{ab}$. Since $g_{ab}\, v^a w^b = v^a w_a$ holds for any arbitrary vector $\mathbf{v}$, it follows that

$$ g_{ab}\, w^b = w_a, $$

which illustrates the fact that the quantities $g_{ab}$ can be used to lower an index.
In other words, we can obtain the covariant components of a vector from its
contravariant components. By a similar argument, we have

$$ g^{ab}\, w_b = w^a, $$

so that the quantities $g^{ab}$ can be used to perform the reverse process of raising
an index. It is straightforward to show that the coordinate and dual basis vectors
themselves are related in an analogous way by

$$ \mathbf{e}_a = g_{ab}\, \mathbf{e}^b \quad \text{and} \quad \mathbf{e}^a = g^{ab}\, \mathbf{e}_b. $$

We will now prove the useful result that the matrix $[g^{ab}]$ containing the
contravariant components of the metric tensor is the inverse of the matrix $[g_{ab}]$ that
contains its covariant components. Using the index-lowering and index-raising
action of $g_{ab}$ and $g^{ab}$ on the components of an arbitrary vector $\mathbf{v}$, we find that

$$ \delta^a_c\, v^c = v^a = g^{ab}\, v_b = g^{ab} g_{bc}\, v^c, $$

but since $\mathbf{v}$ is arbitrary we must have

$$ g^{ab} g_{bc} = \delta^a_c. \tag{3.8} $$

Denoting the matrix $[g_{ab}]$ by $G$ and the matrix $[g^{ab}]$ by $\tilde{G}$, this equation
can be written in matrix form as $\tilde{G} G = I$. Hence $G$ and $\tilde{G}$ are inverse matrices.

3.6 Basis vectors and coordinate transformations 

Let us consider a coordinate transformation $x^a \to x'^a$ on a manifold. There is
a simple relationship between the coordinate basis vectors $\mathbf{e}_a$ associated with
the coordinate system $x^a$ and the coordinate basis vectors $\mathbf{e}'_a$ associated with the
new system of coordinates $x'^a$. It can be found by considering the infinitesimal
displacement vector $d\mathbf{s}$ between two nearby points P and Q. Clearly, this
displacement cannot depend on the coordinate system being used, so we must have

$$ d\mathbf{s} = dx^a\, \mathbf{e}_a = dx'^a\, \mathbf{e}'_a. $$

Noting that $dx^a = (\partial x^a/\partial x'^b)\, dx'^b$, we find that at any point P the two sets of
coordinate basis vectors are related by

$$ \mathbf{e}'_a = \frac{\partial x^b}{\partial x'^a}\, \mathbf{e}_b, \tag{3.9} $$

where the partial derivative is evaluated at the point P. Repeating this calculation
using the dual basis vectors, we find that

$$ \mathbf{e}'^a = \frac{\partial x'^a}{\partial x^b}\, \mathbf{e}^b. \tag{3.10} $$

Using (3.9) and (3.10), we can now calculate how the components of any
general vector $\mathbf{v}$ must transform under the coordinate transformation. Since a
vector is a geometrical entity that is independent of the coordinate system, we
have (for example)

$$ \mathbf{v} = v^a\, \mathbf{e}_a = v'^a\, \mathbf{e}'_a. $$

So, the new contravariant components are given by

$$ v'^a = \mathbf{v} \cdot \mathbf{e}'^a = \frac{\partial x'^a}{\partial x^b}\, v^b. $$

Similarly, the new covariant components are given by

$$ v'_a = \mathbf{v} \cdot \mathbf{e}'_a = \frac{\partial x^b}{\partial x'^a}\, v_b. $$

3.7 Coordinate-independent properties of vectors 

As we have seen, in a coordinate basis and its dual the scalar product $\mathbf{v} \cdot \mathbf{w}$ of two
vectors at each point P of the manifold can be written in four ways:

$$ \mathbf{v} \cdot \mathbf{w} = g_{ab}\, v^a w^b = g^{ab}\, v_a w_b = v^a w_a = v_a w^a. $$

Using the transformation properties of the metric coefficients $g_{ab}$ and those of the
vector components, it is straightforward to show that these expressions yield the
same result in any other coordinate system.
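This invariance can be verified numerically in a simple case. The Python sketch below is an added illustration, with a hypothetical function name and sample values: it assumes the Euclidean plane, with unprimed Cartesian coordinates $(x, y)$ and primed plane polar coordinates $(\rho, \phi)$, and uses the standard transformation laws $v'^a = (\partial x'^a/\partial x^b)\, v^b$ and $v'_a = (\partial x^b/\partial x'^a)\, v_b$ to compare $v^a v_a$ in the two systems.

```python
import math


def to_polar_components(x, y, v_up, v_down):
    """Transform vector components at the point (x, y) from Cartesian to polar."""
    rho = math.hypot(x, y)
    # forward[a][b] = d x'^a / d x^b, with x' = (rho, phi) and x = (x, y)
    forward = [[x / rho, y / rho],
               [-y / rho**2, x / rho**2]]
    # inverse[b][a] = d x^b / d x'^a
    inverse = [[x / rho, -y],
               [y / rho, x]]
    v_up_new = [sum(forward[a][b] * v_up[b] for b in range(2)) for a in range(2)]
    v_down_new = [sum(inverse[b][a] * v_down[b] for b in range(2)) for a in range(2)]
    return v_up_new, v_down_new


v_up = [1.0, 2.0]          # contravariant Cartesian components
v_down = [1.0, 2.0]        # covariant ones coincide, since g_ab = delta_ab
v_up_new, v_down_new = to_polar_components(3.0, 4.0, v_up, v_down)

before = sum(a * b for a, b in zip(v_up, v_down))          # v^a v_a in Cartesians
after = sum(a * b for a, b in zip(v_up_new, v_down_new))   # v'^a v'_a in polars
```

At the point (3, 4) the polar components are $v'^a = (2.2, 0.08)$ and $v'_a = (2.2, 2)$, and both contractions equal 5 to within rounding, although the individual components have changed.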

In a strictly Riemannian manifold the scalar product is positive definite, which
means that $g_{ab}v^a v^b \geq 0$ for all vectors $v^a$, with $g_{ab}v^a v^b = 0$ only if $v^a = 0$. In a
pseudo-Riemannian space, however, this condition is relaxed and leads to some
rather odd properties, such as the possibility of non-zero vectors having zero
length. We must therefore make definitions that allow us to deal with such
properties in a way that extends and generalises familiar concepts in Euclidean space.
The length $|\mathbf{v}|$ of a vector is defined in terms of its components by

$$ |\mathbf{v}| = |g_{ab}v^a v^b|^{1/2} = |g^{ab}v_a v_b|^{1/2} = |v_a v^a|^{1/2}. $$

A unit vector has length unity. As remarked above, in a pseudo-Riemannian
manifold we can have $|v_a v^a|^{1/2} = 0$ for $v^a \neq 0$, in which case the vector $\mathbf{v}$ is
described as null.

The angle $\theta$ between two non-null vectors $\mathbf{v}$ and $\mathbf{w}$ is defined by

$$ \cos\theta = \frac{v_a w^a}{|v_b v^b|^{1/2}\,|w_c w^c|^{1/2}}. $$

In a pseudo-Riemannian manifold, this formula can lead to $|\cos\theta| > 1$, resulting
in a non-real value for $\theta$.

Two vectors are orthogonal if their scalar product is zero. This definition makes
sense even if one or both of the vectors are null. In fact, a null vector is a
non-zero vector that is orthogonal to itself.
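These definitions are easy to try out in a concrete pseudo-Riemannian case. The sketch below assumes four-dimensional Minkowski spacetime in Cartesian coordinates with the sign convention diag(+1, -1, -1, -1); that convention, the helper names and the sample vectors are assumptions made for illustration only.

```python
import math

# Assumed metric: Minkowski, signature (+, -, -, -)
ETA = ((1, 0, 0, 0), (0, -1, 0, 0), (0, 0, -1, 0), (0, 0, 0, -1))

def dot(g, v, w):
    """Scalar product g_ab v^a w^b."""
    return sum(g[a][b] * v[a] * w[b] for a in range(4) for b in range(4))

def length(g, v):
    """Length |g_ab v^a v^b|^(1/2); the modulus handles negative norms."""
    return math.sqrt(abs(dot(g, v, v)))

def is_null(g, v, tol=1e-12):
    """A null vector is non-zero but has zero length."""
    return any(c != 0 for c in v) and abs(dot(g, v, v)) < tol

def cos_angle(g, v, w):
    """cos(theta) for two non-null vectors; may exceed 1 in modulus."""
    return dot(g, v, w) / (length(g, v) * length(g, w))

k = (1.0, 1.0, 0.0, 0.0)   # non-zero, but orthogonal to itself
t = (1.0, 0.0, 0.0, 0.0)   # unit vector
u = (2.0, 1.0, 0.0, 0.0)   # non-null, length sqrt(3)

print(is_null(ETA, k), dot(ETA, k, k))   # True 0.0
print(length(ETA, t))                    # 1.0
print(cos_angle(ETA, t, u))              # 2/sqrt(3), which is greater than 1
```

The last line shows the behaviour noted above: for the pair `t`, `u` the formula gives $|\cos\theta| > 1$, so no real angle exists.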


3.8 Derivatives of basis vectors and the affine connection 

As we have said, local vectors at different points $P$ and $Q$ in a manifold lie
in different tangent spaces, so there is no way of adding or subtracting them.
In order to define the derivative of a vector field, however, one must compare
vectors at different points, albeit in the limit where the distance between the points
tends to zero. We will adopt here an intuitive approach that is sufficient for our
purposes in developing vector calculus on curved manifolds and provides a simple
geometrical picture. Specifically, on this occasion, we will assume the manifold
to be embedded in a higher-dimensional (pseudo-)Euclidean space, which thus
allows vectors at different points to be compared.$^3$

In some arbitrary coordinate system $x^a$ on the manifold, let us consider the basis
vectors $\mathbf{e}_a$ at two nearby points $P$ and $Q$ with coordinates $x^a$ and $x^a + \delta x^a$
respectively (see Figure 3.5). In general, the basis vectors at $Q$ will differ infinitesimally
from those at $P$, so that

$$ \mathbf{e}_a(Q) = \mathbf{e}_a(P) + \delta\mathbf{e}_a. $$


3 It is worth noting that one can embed any four-dimensional torsionless (pseudo-)Riemannian manifold in 
some (pseudo-)Euclidean space of sufficiently higher dimension; see, for example, J. Nash, The imbedding 
problem for Riemannian manifolds, Annals of Mathematics 63, 20-63, 1956 and C. Clarke, On the global 
isometric embedding of pseudo-Riemannian manifolds. Proceedings of the Royal Society A314, 417-28, 
1970. Indeed, recent theoretical work on braneworld models suggests that our spacetime may indeed be 
embedded in some higher-dimensional manifold! Alternatively, one can define the derivative of a vector field 
on a general manifold without using an embedding picture, but in a rather more formal manner; see, for 
example, R.M. Wald, General Relativity , University of Chicago Press, 1984. 
Figure 3.5 The basis vectors $\mathbf{e}_a(P)$ and $\mathbf{e}_a(Q)$ lie in the tangent spaces to the
manifold $\mathcal{M}$ at the points $P$ and $Q$ respectively.


The standard partial derivative of the basis vector is given by $\delta\mathbf{e}_a/\delta x^c$ in the
limit $\delta x^c \to 0$. In general, however, the resulting vector will not lie in the tangent
space to the manifold at $P$. We thus define the derivative in the manifold of the
coordinate basis vector by projecting into the tangent space $T_P$ at $P$,

$$ \frac{\partial\mathbf{e}_a}{\partial x^c} \equiv \lim_{\delta x^c \to 0} \left(\frac{\delta\mathbf{e}_a}{\delta x^c}\right)_{\parallel T_P}. \tag{3.11} $$

Now we can expand this derivative vector in terms of the basis vectors $\mathbf{e}_a(P)$ at
the point $P$, and write

$$ \frac{\partial\mathbf{e}_a}{\partial x^c} = \Gamma^b{}_{ac}\,\mathbf{e}_b, \tag{3.12} $$


where the $N^3$ coefficients $\Gamma^b{}_{ac}$ are known collectively as the affine connection or,
in older textbooks, the Christoffel symbol (of the second kind) at the point $P$.
From (3.11), it is also clear that the derivative operator obeys Leibnitz' theorem.
By taking the scalar product of (3.12) with the dual basis vector $\mathbf{e}^d$ and using the
reciprocity relation (3.2), we can also write the affine connection as$^4$

$$ \Gamma^b{}_{ac} = \mathbf{e}^b \cdot \partial_c \mathbf{e}_a. \tag{3.13} $$

Furthermore, by differentiating the reciprocity relation $\mathbf{e}^a \cdot \mathbf{e}_b = \delta^a_b$ with respect
to the coordinate $x^c$, we find that

$$ \partial_c(\mathbf{e}^a \cdot \mathbf{e}_b) = (\partial_c \mathbf{e}^a)\cdot\mathbf{e}_b + \mathbf{e}^a\cdot(\partial_c \mathbf{e}_b) = 0. $$


4 From now on, we shall often use the shorthand $\partial_c$ to denote $\partial/\partial x^c$. We also note here that, in some textbooks,
an even more terse notation is used, in which partial differentiation is denoted by a comma. For example, the
partial derivative $\partial_c v^a$ of the contravariant components of a vector would be written $v^a{}_{,c}$.
Then, on using (3.13), we find that the derivatives of the dual basis vectors with
respect to the coordinates are given by

$$ \partial_c \mathbf{e}^a = -\Gamma^a{}_{bc}\,\mathbf{e}^b. \tag{3.14} $$

The expressions (3.12)-(3.14) will be used extensively in our subsequent discussions.


3.9 Transformation properties of the affine connection 

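From the expression (3.13) for the affine connection,

$$ \Gamma^a{}_{bc} = \mathbf{e}^a \cdot \frac{\partial\mathbf{e}_b}{\partial x^c}, \tag{3.15} $$

we see that, in some new coordinate system $x'^a$, it is given by

$$ \Gamma'^a{}_{bc} = \mathbf{e}'^a \cdot \frac{\partial\mathbf{e}'_b}{\partial x'^c}. $$

Substituting the expressions (3.9) and (3.10) for the new basis and dual basis
vectors, we find

$$
\begin{aligned}
\Gamma'^a{}_{bc}
&= \frac{\partial x'^a}{\partial x^d}\,\mathbf{e}^d \cdot \frac{\partial}{\partial x'^c}\left(\frac{\partial x^f}{\partial x'^b}\,\mathbf{e}_f\right) \\
&= \frac{\partial x'^a}{\partial x^d}\,\mathbf{e}^d \cdot \left(\frac{\partial x^f}{\partial x'^b}\frac{\partial\mathbf{e}_f}{\partial x'^c} + \frac{\partial^2 x^f}{\partial x'^c\,\partial x'^b}\,\mathbf{e}_f\right) \\
&= \frac{\partial x'^a}{\partial x^d}\frac{\partial x^f}{\partial x'^b}\frac{\partial x^g}{\partial x'^c}\,\mathbf{e}^d \cdot \frac{\partial\mathbf{e}_f}{\partial x^g} + \frac{\partial x'^a}{\partial x^d}\frac{\partial^2 x^f}{\partial x'^c\,\partial x'^b}\,\mathbf{e}^d\cdot\mathbf{e}_f \\
&= \frac{\partial x'^a}{\partial x^d}\frac{\partial x^f}{\partial x'^b}\frac{\partial x^g}{\partial x'^c}\,\Gamma^d{}_{fg} + \frac{\partial x'^a}{\partial x^d}\frac{\partial^2 x^d}{\partial x'^c\,\partial x'^b},
\end{aligned}
\tag{3.16}
$$

where in the last line we have used the reciprocity relation (3.2) between the
basis and dual basis vectors. We will see later that, because of the presence of
the last term on the right-hand side of (3.16), the $\Gamma^a{}_{bc}$ do not transform as the
components of a tensor.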

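By swapping derivatives with respect to $x$ and $x'$ in the last term on the
right-hand side of (3.16), we arrive at an alternative (but equivalent) expression.
The swap follows from differentiating the identity
$(\partial x'^a/\partial x^d)(\partial x^d/\partial x'^b) = \delta^a_b$ with respect to $x'^c$, which gives

$$ \Gamma'^a{}_{bc} = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^f}{\partial x'^b}\frac{\partial x^g}{\partial x'^c}\,\Gamma^d{}_{fg} - \frac{\partial x^d}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\frac{\partial^2 x'^a}{\partial x^d\,\partial x^f}. \tag{3.17} $$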

3.10 Relationship of the connection and the metric

The observant reader will have noticed that there was some arbitrariness in how
we introduced the affine connection in (3.12). We could just as easily have written
(3.12) with $\Gamma^b{}_{ac}$ replaced by $\Gamma^b{}_{ca}$, i.e. with the two subscripts interchanged. In
a general Riemannian manifold, these two sets of quantities are not necessarily
equal to one another. In fact, one can show that the quantities

$$ T^b{}_{ac} = \Gamma^b{}_{ac} - \Gamma^b{}_{ca} \tag{3.18} $$

are the components of a third-rank tensor (see Chapter 4) called the torsion tensor.
For our considerations of standard general relativity, however, we can assume
that our manifolds are torsionless, so that $T^b{}_{ac} = 0$ in any coordinate system.$^5$
Hence, from here onwards, we will assume (unless otherwise stated) that the
affine connection is symmetric in its last two indices, i.e.

$$ \Gamma^b{}_{ac} = \Gamma^b{}_{ca}. \tag{3.19} $$


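In a manifold that is torsionless, so that (3.19) is satisfied, there is a simple
relationship between the affine connection $\Gamma^b{}_{ac}$ and the metric functions $g_{ab}$,
which we now derive. From (3.5) we have $g_{ab} = \mathbf{e}_a \cdot \mathbf{e}_b$. Differentiating this
expression with respect to $x^c$, we obtain

$$
\begin{aligned}
\partial_c g_{ab} &= (\partial_c \mathbf{e}_a)\cdot\mathbf{e}_b + \mathbf{e}_a\cdot(\partial_c \mathbf{e}_b) \\
&= \Gamma^d{}_{ac}\,\mathbf{e}_d\cdot\mathbf{e}_b + \mathbf{e}_a\cdot\Gamma^d{}_{bc}\,\mathbf{e}_d \\
&= \Gamma^d{}_{ac}\,g_{db} + \Gamma^d{}_{bc}\,g_{ad}.
\end{aligned}
\tag{3.20}
$$

By cyclically permuting the indices $a$, $b$, $c$, we obtain two equivalent expressions,

$$
\begin{aligned}
\partial_b g_{ca} &= \Gamma^d{}_{cb}\,g_{da} + \Gamma^d{}_{ab}\,g_{cd}, \\
\partial_a g_{bc} &= \Gamma^d{}_{ba}\,g_{dc} + \Gamma^d{}_{ca}\,g_{bd}.
\end{aligned}
$$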


Using these three expressions, we now form the combination

$$
\begin{aligned}
\partial_c g_{ab} + \partial_b g_{ca} - \partial_a g_{bc}
&= \Gamma^d{}_{ac}\,g_{db} + \Gamma^d{}_{bc}\,g_{ad} + \Gamma^d{}_{cb}\,g_{da} + \Gamma^d{}_{ab}\,g_{cd} - \Gamma^d{}_{ba}\,g_{dc} - \Gamma^d{}_{ca}\,g_{bd} \\
&= 2\,\Gamma^d{}_{cb}\,g_{ad},
\end{aligned}
$$

where, in obtaining the last line, we have used the assumed symmetry properties
(3.19) of the affine connection and the symmetry of the metric functions. Multiplying
through by $g^{ea}$, recalling from (3.8) that $g^{ea}g_{ad} = \delta^e_d$ and relabelling indices, we
finally obtain

$$ \Gamma^a{}_{bc} = \tfrac{1}{2}\,g^{ad}\left(\partial_b g_{dc} + \partial_c g_{bd} - \partial_d g_{bc}\right). \tag{3.21} $$


5 It is straightforward to show that any (pseudo-)Riemannian manifold that can be embedded in some (pseudo-)
Euclidean space of higher dimension must be torsionless.


In fact, the quantity defined by the right-hand side in (3.21) is properly called the
metric connection and is often denoted by the symbol $\left\{ {a \atop bc} \right\}$. In a manifold with
torsion, it will differ from the affine connection defined by (3.11). As we have
shown, however, in a torsionless manifold the affine and metric connections are
equivalent, and so $\Gamma^a{}_{bc}$ is usually referred to simply as the connection. Unless
otherwise stated, we will follow this convention from now on.

Equation (3.21) is very important, because it tells us how to compute the
connection at any point in a manifold. In other words, if one knows the metric $g_{ab}$
in some coordinate system $x^a$ then one can form the derivatives of $g_{ab}$ appearing
in (3.21) and hence calculate all the numbers $\Gamma^a{}_{bc}$ at any point.
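The recipe in (3.21) can be turned into a short numerical routine. The sketch below is one possible illustration rather than a general-purpose tool: it assumes a two-dimensional manifold (so that a hand-written 2x2 inverse suffices), estimates the metric derivatives by central differences, and uses the Euclidean plane in polar coordinates $(r, \theta)$ as the test metric. All function names are hypothetical.

```python
def metric_polar(x):
    """Metric g_ab of the Euclidean plane in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(g):
    """Inverse metric g^ab for a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, c, h=1e-5):
    """Central-difference estimate of d_c g_ab at the point x."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)] for a in range(n)]

def connection(metric, x):
    """gamma[a][b][c] = (1/2) g^{ad} (d_b g_dc + d_c g_bd - d_d g_bc), as in (3.21)."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, c) for c in range(n)]  # dg[c][a][b] = d_c g_ab
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                gamma[a][b][c] = 0.5 * sum(
                    g_inv[a][d] * (dg[b][d][c] + dg[c][b][d] - dg[d][b][c])
                    for d in range(n))
    return gamma

gamma = connection(metric_polar, (2.0, 0.3))
print(gamma[0][1][1])   # Gamma^r_{theta theta}, close to -r = -2
print(gamma[1][0][1])   # Gamma^theta_{r theta}, close to 1/r = 0.5
```

For this metric the only non-zero coefficients are $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r$, which the routine reproduces to within the finite-difference error.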

We finish this section by establishing a few useful formulae involving the
connection $\Gamma^a{}_{bc}$ and the related quantities

$$ \Gamma_{abc} \equiv g_{ad}\,\Gamma^d{}_{bc}. $$

It is straightforward to show that $\Gamma^a{}_{bc} = g^{ad}\,\Gamma_{dbc}$. From (3.21), we find that

$$ \Gamma_{abc} = \tfrac{1}{2}\left(\partial_b g_{ac} + \partial_c g_{ba} - \partial_a g_{bc}\right). \tag{3.22} $$

The quantity $\Gamma_{abc}$ is traditionally known as a Christoffel symbol of the first kind.
Adding $\Gamma_{bac}$ to $\Gamma_{abc}$ gives

$$ \partial_c g_{ab} = \Gamma_{abc} + \Gamma_{bac}, \tag{3.23} $$

which allows us to express partial derivatives of the metric components in terms
of the connection coefficients. If we denote the value of the determinant $\det[g_{ab}]$
by $g$ then the cofactor of the element $g_{ab}$ in this determinant is $g\,g^{ab}$ (note that
$g$ is not a scalar: changing coordinates changes the value of $g$ at any point). It
follows that $\partial_c g = g\,g^{ab}(\partial_c g_{ab})$, so from (3.23) we have

$$ \partial_c g = g\,g^{ab}\left(\Gamma_{abc} + \Gamma_{bac}\right) = g\left(\Gamma^b{}_{bc} + \Gamma^a{}_{ac}\right) = 2g\,\Gamma^a{}_{ac}. \tag{3.24} $$

The implied summation over $a$ is an example of a contraction over a pair of
indices (see Chapter 4); $\Gamma^a{}_{ac}$ means simply $\Gamma^1{}_{1c} + \Gamma^2{}_{2c} + \cdots + \Gamma^N{}_{Nc}$. Thus the
contraction of the connection coefficients (3.21) is given by

$$ \Gamma^a{}_{ab} = \tfrac{1}{2}\,g^{-1}\,\partial_b g = \tfrac{1}{2}\,\partial_b \ln|g|, \tag{3.25} $$

the modulus signs being needed if the manifold is pseudo-Riemannian. Alternatively,
we can write

$$ \Gamma^a{}_{ab} = \partial_b \ln\sqrt{|g|} = \frac{1}{\sqrt{|g|}}\,\partial_b\sqrt{|g|}. \tag{3.26} $$
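As a quick consistency check of (3.26), the sketch below assumes three-dimensional Euclidean space in spherical polar coordinates $(r, \theta, \phi)$, for which $g_{ab} = \mathrm{diag}(1, r^2, r^2\sin^2\theta)$ and hence $\sqrt{|g|} = r^2\sin\theta$. The contracted coefficients are written out from the standard non-zero values $\Gamma^\theta{}_{\theta r} = \Gamma^\phi{}_{\phi r} = 1/r$ and $\Gamma^\phi{}_{\phi\theta} = \cot\theta$; the choice of example and the function names are assumptions made here for illustration.

```python
import math

def sqrt_abs_g(r, theta):
    """sqrt|g| for g_ab = diag(1, r^2, r^2 sin^2(theta))."""
    return r * r * math.sin(theta)

def dlog_sqrt_g(r, theta, index, h=1e-6):
    """Central-difference estimate of d_b ln sqrt|g| for b = 0 (r), 1 (theta), 2 (phi)."""
    if index == 0:
        return (math.log(sqrt_abs_g(r + h, theta))
                - math.log(sqrt_abs_g(r - h, theta))) / (2 * h)
    if index == 1:
        return (math.log(sqrt_abs_g(r, theta + h))
                - math.log(sqrt_abs_g(r, theta - h))) / (2 * h)
    return 0.0  # sqrt|g| does not depend on phi

r, theta = 1.5, 0.8

# Contractions Gamma^a_{ab}, summed term by term over a = r, theta, phi
contracted_r = 0.0 + 1.0 / r + 1.0 / r               # b = r
contracted_theta = 0.0 + 0.0 + 1.0 / math.tan(theta) # b = theta

print(dlog_sqrt_g(r, theta, 0), contracted_r)        # both close to 2/r
print(dlog_sqrt_g(r, theta, 1), contracted_theta)    # both close to cot(theta)
```

Both sides agree, which is just the statement $\partial_r \ln(r^2\sin\theta) = 2/r$ and $\partial_\theta \ln(r^2\sin\theta) = \cot\theta$.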


3.11 Local geodesic and Cartesian coordinates 

In Chapter 2, we showed that at any point $P$ in a pseudo-Riemannian manifold it
is possible in principle to find local Cartesian coordinates $x'^a$ such that

$$ g'_{ab}(P) = \eta_{ab}, \tag{3.27} $$

$$ \left.\frac{\partial g'_{ab}}{\partial x'^c}\right|_P = 0, \tag{3.28} $$

where $[\eta_{ab}] = \mathrm{diag}(\pm 1, \pm 1, \ldots, \pm 1)$. The number of positive entries in $[\eta_{ab}]$
minus the number of negative entries is the signature of the manifold. Supposing
that we start with some general system of coordinates $x^a$, we now show how to
obtain local Cartesian coordinates in practice.

Let us begin by demanding that our new coordinate system $x'^a$ satisfies the
condition (3.28) but not necessarily the condition (3.27). From our expression
(3.20) for the derivative of the metric in terms of the connection, we see that
the condition (3.28) will be satisfied if the connection coefficients in the new
coordinate system vanish at $P$, i.e.

$$ \Gamma'^a{}_{bc}(P) = 0. \tag{3.29} $$

Conversely, from (3.21) we see that the condition (3.28) implies (3.29). The
condition (3.29) greatly simplifies the mathematics of parallel transport, covariant
differentiation and intrinsic differentiation (see later). Coordinates for which
(3.29) holds are generally referred to as geodesic coordinates about $P$, but this is
not always appropriate since they need not be based on geodesics (which we will
also discuss later).

Suppose that we start with some arbitrary coordinate system $x^a$, the 'original'
system, in which the point $P$ has coordinates $x^a_P$. Let us now define a new system
of coordinates $x'^a$ by

$$ x'^a = x^a - x^a_P + \tfrac{1}{2}\,\Gamma^a{}_{bc}(P)\,(x^b - x^b_P)(x^c - x^c_P), \tag{3.30} $$

where the $\Gamma^a{}_{bc}(P)$ are the connection coefficients at $P$ in the original coordinate
system. Clearly, the origin of the new coordinate system is at $P$. Differentiation
of (3.30) with respect to $x^d$ yields

$$ \frac{\partial x'^a}{\partial x^d} = \delta^a_d + \Gamma^a{}_{dc}(P)\,(x^c - x^c_P), $$

so that, at the point $P$, $\partial x'^a/\partial x^d = \delta^a_d$; its inverse is given by $\partial x^a/\partial x'^d = \delta^a_d$.
Differentiating again we obtain

$$ \frac{\partial^2 x'^a}{\partial x^e\,\partial x^d} = \Gamma^a{}_{dc}(P)\,\delta^c_e = \Gamma^a{}_{de}(P). $$

If we now substitute these results into the expression (3.17) for the transformation
properties of the connection, we find that

$$ \Gamma'^a{}_{bc}(P) = \delta^a_d\,\delta^f_b\,\delta^g_c\,\Gamma^d{}_{fg}(P) - \delta^d_b\,\delta^f_c\,\Gamma^a{}_{df}(P) = \Gamma^a{}_{bc}(P) - \Gamma^a{}_{bc}(P) = 0. $$

So in the new (primed) coordinate system the connection coefficients at $P$ are
zero, and from (3.29) we have a system of geodesic coordinates at $P$.

The metric functions $g'_{ab}(P)$ in the geodesic coordinates $x'^a$ will not necessarily
satisfy the condition (3.27). Nevertheless, we can obtain such a system of local
Cartesian coordinates by making a second linear coordinate transformation

$$ x''^a = X^a{}_b\,x'^b, $$

where the coefficients $X^a{}_b$ are constants. Thus we can bring the metric $g''_{ab}(P)$
in these coordinates into the form (3.27) without affecting its derivatives, so that
(3.28) will still be satisfied. The required values of the coefficients $X^a{}_b$ were
discussed in Section 2.13.


3.12 Covariant derivative of a vector

Suppose that a vector field $\mathbf{v}(x)$ is defined over some region of a manifold. We
will consider the derivative of this vector field with respect to the coordinates
labelling the points in the manifold. Let us begin by writing the vector in terms
of its contravariant components, $\mathbf{v} = v^a\,\mathbf{e}_a$. We thus obtain

$$ \partial_b \mathbf{v} = (\partial_b v^a)\,\mathbf{e}_a + v^a\,(\partial_b \mathbf{e}_a), \tag{3.31} $$

where the second term arises because, in an arbitrary coordinate system, the
coordinate basis vectors vary with the position in the manifold. If we defined locally
Cartesian coordinates at some point $P$ in the manifold then in the neighbourhood
of this point the coordinate basis vectors are constant and so the second term would
vanish at $P$ (but not elsewhere, unless the manifold $\mathcal{M}$ is (pseudo-)Euclidean, so
that the whole of $\mathcal{M}$ can be covered by a Cartesian coordinate system).
Using (3.13), we may write (3.31) as

$$ \partial_b \mathbf{v} = (\partial_b v^a)\,\mathbf{e}_a + v^a\,\Gamma^c{}_{ab}\,\mathbf{e}_c. $$

Since $a$ and $c$ are dummy indices in the last term on the right-hand side, we may
interchange them to obtain

$$ \partial_b \mathbf{v} = (\partial_b v^a)\,\mathbf{e}_a + v^c\,\Gamma^a{}_{cb}\,\mathbf{e}_a = \left(\partial_b v^a + v^c\,\Gamma^a{}_{cb}\right)\mathbf{e}_a. $$


The reason for interchanging the dummy indices is that we may then factor out
$\mathbf{e}_a$. Thus, at any point $P$, we now have an expression for the derivative of a vector
field with respect to the coordinates in terms of the basis vectors of the coordinate
system at $P$. The quantity in brackets is called the covariant derivative of the
vector components, and the standard notation for it is$^6$

$$ \nabla_b v^a \equiv \partial_b v^a + \Gamma^a{}_{cb}\,v^c. \tag{3.32} $$

Thus the derivative of the vector field $\mathbf{v}$ can be written in the compact notation

$$ \partial_b \mathbf{v} = (\nabla_b v^a)\,\mathbf{e}_a. $$

We note that, in local geodesic coordinates about some point $P$, the second term
in the covariant derivative (3.32) vanishes at $P$ and thus reduces to the ordinary
partial derivative.

So far we have considered only the covariant derivative of the contravariant
components $v^a$ of a vector. The corresponding result for the covariant components
$v_a$ may be found in a similar way, by considering the derivative of $\mathbf{v} = v_a\,\mathbf{e}^a$ and
using (3.14) to obtain

$$ \nabla_b v_a \equiv \partial_b v_a - \Gamma^c{}_{ab}\,v_c. \tag{3.33} $$


Comparing the expressions (3.32) and (3.33) for the covariant derivatives of
the contravariant and covariant components of a vector respectively, we see that
there are some similarities and some differences. It may help to remember that
the index with respect to which the covariant derivative is taken ($b$ in this case) is
also the last subscript on the connection; the remaining indices can then only be
arranged in one way without raising or lowering them. Finally, the sign difference
must be remembered: for a contravariant index (superscript) the sign is positive,
whereas for a covariant index (subscript) the connection carries a minus sign.
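A simple numerical test of (3.32) is to take a vector field that is manifestly constant and confirm that its covariant derivative vanishes even though its components vary. The sketch below assumes the Euclidean plane in polar coordinates $(r, \theta)$ and the constant Cartesian field $\mathbf{e}_x$, whose polar components are $v^r = \cos\theta$ and $v^\theta = -\sin\theta/r$; the function names and step size are illustrative assumptions.

```python
import math

def v_polar(r, theta):
    """Polar components (v^r, v^theta) of the constant Cartesian field e_x."""
    return (math.cos(theta), -math.sin(theta) / r)

def gamma_polar(r):
    """G[a][b][c] = Gamma^a_{bc} for the plane in polar coordinates (0 = r, 1 = theta)."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -r                      # Gamma^r_{theta theta}
    G[1][0][1] = G[1][1][0] = 1.0 / r    # Gamma^theta_{r theta} = Gamma^theta_{theta r}
    return G

def covariant_derivative(r, theta, h=1e-6):
    """Return D[b][a] = d_b v^a + Gamma^a_{cb} v^c, with d_b v^a by central differences."""
    v = v_polar(r, theta)
    G = gamma_polar(r)
    plus = (v_polar(r + h, theta), v_polar(r, theta + h))
    minus = (v_polar(r - h, theta), v_polar(r, theta - h))
    D = [[0.0, 0.0], [0.0, 0.0]]
    for b in range(2):
        for a in range(2):
            partial = (plus[b][a] - minus[b][a]) / (2 * h)
            D[b][a] = partial + sum(G[a][c][b] * v[c] for c in range(2))
    return D

D = covariant_derivative(1.7, 0.4)
print(D)   # all four entries vanish to within the finite-difference error
```

The partial derivatives $\partial_b v^a$ are not zero here (for instance $\partial_\theta v^r = -\sin\theta$); it is the connection term that cancels them.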

We conclude this section by considering the covariant derivative of a scalar.
The covariant derivative differs from the simple partial derivative only because
the coordinate basis vectors change with position in the manifold. However, a
scalar $\phi$ does not depend on the basis vectors at all, so its covariant derivative
must be the same as its partial derivative, i.e.

$$ \nabla_b \phi = \partial_b \phi. \tag{3.34} $$


6 In some textbooks, the covariant derivative is denoted by a semicolon, so that the covariant derivative $\nabla_b v^a$
would be written as $v^a{}_{;b}$.

3.13 Vector operators in component form 

The equations of electromagnetism, fluid mechanics and many other areas of
classical physics make use of vector calculus in three-dimensional Euclidean
space, employing the gradient $\nabla\phi$ and the Laplacian $\nabla^2\phi$ of scalar fields, together
with the divergence $\nabla\cdot\mathbf{v}$ and the curl $\nabla\times\mathbf{v}$ of a vector field. Explicit forms for
these are given in many texts for useful coordinate systems such as Cartesian,
cylindrical polar and spherical polar (typically the 11 coordinate systems in which
Laplace's equation separates). The covariant derivative provides a unified picture
of all these derivatives and a direct route to the explicit forms in an arbitrary
coordinate system. Moreover, it allows for the generalisation of these operators
to more general manifolds.


Gradient 

The gradient of a scalar field $\phi$ is given simply by

$$ \nabla\phi = (\nabla_a \phi)\,\mathbf{e}^a = (\partial_a \phi)\,\mathbf{e}^a, \tag{3.35} $$

since the covariant derivative of a scalar is the same as its partial derivative.


Divergence 

Replacing the partial derivatives that occur in local Cartesian coordinates by
covariant derivatives, which are valid in arbitrary coordinate systems, the
divergence of a vector field is given by the scalar quantity

$$ \nabla\cdot\mathbf{v} = \nabla_a v^a = \partial_a v^a + \Gamma^a{}_{ab}\,v^b. $$

Using the result (3.26) we can rewrite the divergence as

$$ \nabla\cdot\mathbf{v} = \nabla_a v^a = \frac{1}{\sqrt{|g|}}\,\partial_a\!\left(\sqrt{|g|}\,v^a\right), \tag{3.36} $$

where $g$ is the determinant of the matrix $[g_{ab}]$.
Laplacian

If we replace $\mathbf{v}$ by $\nabla\phi$ in $\nabla\cdot\mathbf{v}$ then we obtain the Laplacian $\nabla^2\phi$. From (3.35),
$\mathbf{v} = v_a\,\mathbf{e}^a = (\partial_a\phi)\,\mathbf{e}^a$, so the covariant components are $v_a = \partial_a\phi$. In (3.36), however,
we require the contravariant components $v^a$. These may be obtained by raising
the index with the metric to give

$$ v^a = g^{ab}\,v_b = g^{ab}\,\partial_b\phi. $$

Substituting this into (3.36), we obtain

$$ \nabla^2\phi = \nabla_a\nabla^a\phi = \frac{1}{\sqrt{|g|}}\,\partial_a\!\left(\sqrt{|g|}\,g^{ab}\,\partial_b\phi\right). $$

It is worth noting that the symbol used for the Laplacian operator often depends
on the dimensionality of the manifold being used. In particular, the triangular
(three-sided) symbol $\nabla^2$ that is commonly used in the three-dimensional (and
$N$-dimensional) cases is replaced by the box-shaped (four-sided) symbol $\Box^2$ in
four-dimensional spacetimes, in which case it is called the d'Alembertian operator.
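The component form of the Laplacian can be tried out numerically. The following sketch assumes the Euclidean plane in polar coordinates, where $\sqrt{|g|} = r$ and $g^{ab} = \mathrm{diag}(1, 1/r^2)$, and evaluates $\frac{1}{\sqrt{|g|}}\partial_a(\sqrt{|g|}\,g^{ab}\partial_b\phi)$ with nested central differences. The test functions and the step size are illustrative choices: $r^2\cos 2\theta = x^2 - y^2$ is harmonic, while $r^2 = x^2 + y^2$ has Laplacian 4.

```python
import math

def laplacian_polar(phi, r, theta, h=1e-4):
    """Evaluate (1/sqrt|g|) d_a( sqrt|g| g^{ab} d_b phi ) in plane polar coordinates."""
    def flux_r(rr):
        # sqrt|g| g^{rr} d_r phi = rr * d_r phi, evaluated at (rr, theta)
        return rr * (phi(rr + h, theta) - phi(rr - h, theta)) / (2 * h)

    def flux_theta(th):
        # sqrt|g| g^{theta theta} d_theta phi = (1/r) * d_theta phi, at (r, th)
        return (1.0 / r) * (phi(r, th + h) - phi(r, th - h)) / (2 * h)

    divergence = ((flux_r(r + h) - flux_r(r - h)) / (2 * h)
                  + (flux_theta(theta + h) - flux_theta(theta - h)) / (2 * h))
    return divergence / r

def harmonic(r, theta):
    """x^2 - y^2 written in polar coordinates."""
    return r * r * math.cos(2 * theta)

def radial_square(r, theta):
    """x^2 + y^2 written in polar coordinates."""
    return r * r

print(laplacian_polar(harmonic, 1.3, 0.5))       # close to 0
print(laplacian_polar(radial_square, 1.3, 0.5))  # close to 4
```

Written out, the expression being differenced is the familiar $\frac{1}{r}\partial_r(r\,\partial_r\phi) + \frac{1}{r^2}\partial_\theta^2\phi$.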

Curl 

The special form of the curl of a vector field, which is itself a vector, exists
only in three dimensions. In its more general form, which is valid in higher
dimensions, the curl is defined as a rank-2 antisymmetric tensor (see Chapter 4)
with components

$$ (\mathrm{curl}\,\mathbf{v})_{ab} = \nabla_a v_b - \nabla_b v_a. $$

In fact this difference of covariant derivatives can be simplified, since

$$ \nabla_a v_b - \nabla_b v_a = \partial_a v_b - \Gamma^c{}_{ba}\,v_c - \partial_b v_a + \Gamma^c{}_{ab}\,v_c = \partial_a v_b - \partial_b v_a, $$

where the connections have cancelled because of their symmetry properties.


3.14 Intrinsic derivative of a vector along a curve 

Normally, we think of vector fields as functions of the coordinates $x^a$ defined
over some region of the manifold. However, we can also encounter vector fields
that are defined only on some subspace of the manifold, and an extreme example
occurs when the vector field $\mathbf{v}(u)$ is defined only along some curve $x^a(u)$ in the
manifold; an example might be the spin 4-vector $\mathbf{s}(\tau)$ of a single particle along
its worldline in spacetime. We now consider how to calculate the derivative of
such a vector with respect to the parameter $u$ along the curve.
Let us begin by writing the vector field at any point along the curve $\mathcal{C}$ as

$$ \mathbf{v}(u) = v^a(u)\,\mathbf{e}_a(u), $$

where the $\mathbf{e}_a(u)$ are the coordinate basis vectors at the point on the curve
corresponding to the parameter value $u$. Thus, the derivative of $\mathbf{v}$ along the curve $\mathcal{C}$ is
given by

$$ \frac{d\mathbf{v}}{du} = \frac{dv^a}{du}\,\mathbf{e}_a + v^a\,\frac{d\mathbf{e}_a}{du} = \frac{dv^a}{du}\,\mathbf{e}_a + v^a\,\frac{\partial\mathbf{e}_a}{\partial x^c}\frac{dx^c}{du}, $$

where we have used the chain rule to rewrite the last term on the right-hand side;
this is a valid procedure since the basis vectors $\mathbf{e}_a$ are also defined away from
the curve $\mathcal{C}$. Using (3.13) to write the partial derivatives of the basis vectors in
terms of the connection, we obtain

$$ \frac{d\mathbf{v}}{du} = \frac{dv^a}{du}\,\mathbf{e}_a + \Gamma^b{}_{ac}\,v^a\,\frac{dx^c}{du}\,\mathbf{e}_b. $$

Interchanging the dummy indices $a$ and $b$ in the last term, we may factor out the
basis vector, and we find that

$$ \frac{d\mathbf{v}}{du} = \left(\frac{dv^a}{du} + \Gamma^a{}_{bc}\,v^b\,\frac{dx^c}{du}\right)\mathbf{e}_a \equiv \frac{Dv^a}{Du}\,\mathbf{e}_a. \tag{3.37} $$


The term in parentheses is called the intrinsic (or absolute) derivative of the
components $v^a$ along the curve $\mathcal{C}$ and is often denoted by $Dv^a/Du$ as indicated.
Similarly, the intrinsic derivative of the covariant components $v_a$ of a vector is
given by

$$ \frac{Dv_a}{Du} = \frac{dv_a}{du} - \Gamma^b{}_{ac}\,v_b\,\frac{dx^c}{du}. $$

A convenient way to remember the form of the intrinsic derivative is to pretend
that the vector $\mathbf{v}$ is in fact defined throughout (some region of) the manifold, i.e.
not only along the curve $\mathcal{C}$. In some cases of interest, this may in fact be true
anyway; for example, $\mathbf{v}$ might denote the 4-velocity of some distributed fluid. We
can now differentiate the components $v^a$ (say) with respect to the coordinates $x^a$.
Thus we can write

$$ \frac{dv^a}{du} = \frac{\partial v^a}{\partial x^c}\frac{dx^c}{du}. $$

Substituting this into (3.37), we can then factor out $dx^c/du$ and recognise the
other factor as the covariant derivative $\nabla_c v^a$. Thus we can write

$$ \frac{Dv^a}{Du} = (\nabla_c v^a)\,\frac{dx^c}{du}, \tag{3.38} $$

and similarly for the intrinsic derivatives of the covariant components. It must be
remembered, however, that if $\mathbf{v}$ is only defined along the curve $\mathcal{C}$ then formally
(3.38) is not defined and acts merely as an aide-memoire.


3.15 Parallel transport 

Let us again consider some curve $\mathcal{C}$ in the manifold, given parametrically in
some general coordinate system by $x^a(u)$. Moreover, let $O$ be some initial point
on the curve with parameter $u_0$ at which a vector $\mathbf{v}$ is defined. We can now think
of 'transporting' $\mathbf{v}$ along $\mathcal{C}$ in such a way that

$$ \frac{d\mathbf{v}}{du} = 0 \tag{3.39} $$

is satisfied at each point along the curve. The result is a 'parallel' field of vectors
at each point along $\mathcal{C}$, generated by the parallel transport of $\mathbf{v}$.

In a (pseudo-)Euclidean manifold, the parallel transport of a vector has the
simple geometrical interpretation that the vector $\mathbf{v}$ is transported without any
change to its length or direction. This is illustrated in Figure 3.6 for a curve $\mathcal{C}$
in a two-dimensional Euclidean space (i.e. a plane). If the coordinates $x^a$ are
Cartesian, it is clear that the components $v^a$ of the vector field satisfy

$$ \frac{dv^a}{du} = 0. \tag{3.40} $$


Figure 3.6 A parallel field of vectors $\mathbf{v}(u)$ generated by parallel transport along
a curve $\mathcal{C}$ parameterised by $u$.
In an arbitrary coordinate system in the plane, however, (3.40) is no longer valid,
and from (3.37) we see that it must be generalised to

$$ \frac{Dv^a}{Du} \equiv \frac{dv^a}{du} + \Gamma^a{}_{bc}\,v^b\,\frac{dx^c}{du} = 0. \tag{3.41} $$

From the basic requirement (3.39), it is clear that (3.41) is equally valid for the
parallel transport of a vector along a curve in any (pseudo-)Riemannian manifold
in some arbitrary coordinate system $x^a$, although the geometrical interpretation is
more subtle in this case. If one is willing to adopt a picture in which the (pseudo-)
Riemannian manifold is embedded in a (pseudo-)Euclidean space of sufficiently
higher dimension, then one can recover a simple geometrical interpretation of
parallel transport. Consider some curve $\mathcal{C}$ in the (pseudo-)Riemannian manifold
given in terms of some coordinate system in the manifold by $x^a(u)$. Let $P$ and
$Q$ be two neighbouring points on the curve with affine parameter values $u$ and
$u + \delta u$ respectively. Starting with the vector $\mathbf{v}$ at $P$, which lies in the tangent
space $T_P$, shift the vector to the neighbouring point $Q$ while keeping it parallel
to itself. In a Euclidean embedding space, this simply means transporting the
vector without changing its length or direction. At the point $Q$ the vector will
not, in general, lie in the tangent space $T_Q$, on account of the curvature of the
embedded manifold. Nevertheless, by considering only that part of the vector
that is tangential to the embedded manifold at $Q$, we obtain a definite vector
lying in $T_Q$. It is straightforward to show that this vector coincides with the
parallel-transported vector at $Q$ according to (3.41).

If we rewrite (3.41) as

$$ \frac{dv^a}{du} = -\Gamma^a{}_{bc}\,v^b\,\frac{dx^c}{du}, \tag{3.42} $$

then we can see that, if we specify the components $v^a$ at some arbitrary point
along the curve, equation (3.42) fixes the components $v^a$ along the entire
length of the curve. If you are worried about whether the transportation is really
parallel, simply consider an infinitesimal displacement of the vector from some
point $P$. For a small displacement we can choose locally Cartesian coordinates at
$P$, in which the $\Gamma$s vanish, and so setting the covariant derivative equal to zero
describes an infinitesimal displacement which keeps the vector parallel ($dv^a = 0$).

We note here that, in at least one respect, parallel transport along curves in a
general (pseudo-)Riemannian manifold is significantly different from that along
curves in a (pseudo-)Euclidean space, in that it is path dependent: the vector
obtained by transporting a given vector from a point $P$ to a remote point $Q$
depends on the route taken from $P$ to $Q$. This path dependence is also apparent
in transporting a vector around a closed loop, where on returning to the starting
point the direction of the transported vector is (in general) different from the
vector's initial direction. This path dependence can be demonstrated on a curved
two-dimensional surface, and in general can be expressed mathematically in terms
of the curvature tensor of the manifold. We will return to this topic in Chapter 7.
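The closed-loop effect can be seen by integrating the transport equation $dv^a/du = -\Gamma^a{}_{bc}\,v^b\,dx^c/du$ numerically. The sketch below is an illustration under stated assumptions: the manifold is the unit 2-sphere with coordinates $(\theta, \phi)$ and metric $\mathrm{diag}(1, \sin^2\theta)$, whose non-zero connection coefficients are $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \Gamma^\phi{}_{\phi\theta} = \cot\theta$; the loop is the circle of latitude $\theta = \theta_0$, $\phi = u$; and a standard fourth-order Runge-Kutta step is used. The function names are hypothetical.

```python
import math

def transport_rhs(v, theta0):
    """Right-hand side dv^a/du = -Gamma^a_{bc} v^b dx^c/du along theta = theta0, phi = u."""
    v_theta, v_phi = v
    # Only dx^phi/du = 1 is non-zero on this curve.
    dv_theta = math.sin(theta0) * math.cos(theta0) * v_phi      # -Gamma^theta_{phi phi} v^phi
    dv_phi = -(math.cos(theta0) / math.sin(theta0)) * v_theta   # -Gamma^phi_{theta phi} v^theta
    return (dv_theta, dv_phi)

def parallel_transport_latitude(v0, theta0, steps=2000):
    """Transport v0 once around the circle of latitude using RK4."""
    h = 2 * math.pi / steps
    v = v0
    for _ in range(steps):
        k1 = transport_rhs(v, theta0)
        k2 = transport_rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]), theta0)
        k3 = transport_rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]), theta0)
        k4 = transport_rhs((v[0] + h * k3[0], v[1] + h * k3[1]), theta0)
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return v

theta0 = math.pi / 3                       # latitude circle at 60 degrees from the pole
v_final = parallel_transport_latitude((1.0, 0.0), theta0)
length_sq = v_final[0] ** 2 + math.sin(theta0) ** 2 * v_final[1] ** 2

print(v_final)     # close to (-1, 0): the vector comes back reversed
print(length_sq)   # close to 1: the length is unchanged
```

For this loop the components rotate through the angle $2\pi\cos\theta_0$, which equals $\pi$ when $\theta_0 = \pi/3$; on the equator ($\theta_0 = \pi/2$), which is a geodesic, the vector returns unchanged.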


3.16 Null curves, non-null curves and affine parameters 


So far, we have treated all curves in a manifold on an equal footing. In
pseudo-Riemannian manifolds, however, it is important to distinguish between null curves
and non-null curves. In the former, the interval $ds$ between any two nearby points
on the curve is zero, whereas in the latter case $ds$ is non-zero. The distinction
between these two types of curve may also be defined in terms of their tangent
vectors, and this leads to the identification of a class of privileged parameters,
called affine parameters, in terms of which the curves may be defined.

Consider some curve $x^a(u)$ in a general manifold. As discussed earlier, the
tangent vector $\mathbf{t}$ to the curve at some point $P$, with respect to the parameter value
$u$, is defined by (3.1). In a given coordinate system, we can write $\delta\mathbf{s} = \mathbf{e}_a\,\delta x^a$,
where the $\mathbf{e}_a$ are coordinate basis vectors at $P$. We then obtain

$$ \mathbf{t} = \frac{dx^a}{du}\,\mathbf{e}_a. \tag{3.43} $$

From this expression, we see that the length of the tangent vector $\mathbf{t}$ to the curve
$x^a(u)$ at the point $P$ is given by

$$ |\mathbf{t}| = |g_{ab}\,t^a t^b|^{1/2} = \left|g_{ab}\,\frac{dx^a}{du}\frac{dx^b}{du}\right|^{1/2} = \frac{|g_{ab}\,dx^a\,dx^b|^{1/2}}{|du|} = \left|\frac{ds}{du}\right|, $$

where $ds$ is the distance measured along the curve at $P$ that corresponds to the
parameter interval $du$ along the curve.

A non-null curve is one for which the tangent vector at every point is not null,
i.e. $|\mathbf{t}| \neq 0$. For such a curve, the length of the tangent vector at each point depends
on the parameter $u$ and, in general, can vary along the curve. However, we see
that if the curve is parameterised in terms of a parameter $u$ that is related to the
distance $s$ measured along the curve by $u = as + b$, where $a$ and $b$ are constants,
with $a \neq 0$, then the length of the tangent vector will be constant along the curve.
In this case $u$ is called an affine parameter along the curve. Moreover, if we take
$u = s$ then the tangent vector (with components $dx^a/ds$) is always of unit length.

A null curve is one for which the tangent vector is null, $|\mathbf{t}| = 0$, at every point
along the curve; equivalently, the distance $ds$ between any two points on a null
curve is zero. Since $s$ does not vary along the curve, we clearly cannot use it as
a parameter. We are, however, free to use any other non-zero scalar parameter $u$
that does vary along the curve. Moreover, even for null curves it is still possible
to define a privileged family of affine parameters. The definition of an affine
parameter for a null curve is best introduced through the study of geodesics.


3.17 Geodesics 


A geodesic in Euclidean space is a straight line, which has two equivalent defining 
properties. First, its tangent vector always points in the same direction (along 
the line) and, second, it is the curve of shortest length between two points. We 
can use generalisations of either property to define geodesics in more general 
manifolds. The fixed direction of their tangent vectors can be used to define both 
non-null and null geodesics in a pseudo-Riemannian manifold, whereas clearly 
the extremal length can only be used to define non-null geodesics. In a manifold 
that is torsionless (so that (3.19) is satisfied) these two defining properties are
equivalent for non-null geodesics, and lead to the same curves.$^7$

Let us begin by characterising a geodesic as a curve $x^a(u)$ described in terms
of some general parameter $u$ by the fixed direction of its tangent vector $\mathbf{t}(u)$. The
equations satisfied by the functions $x^a(u)$ are thus determined by the requirement
that, along the curve,

$$ \frac{d\mathbf{t}}{du} = \lambda(u)\,\mathbf{t}, \tag{3.44} $$

where $\lambda(u)$ is some function of $u$. From (3.41), we see that the components $t^a$ of
the tangent vector in the coordinate basis must satisfy

$$ \frac{Dt^a}{Du} \equiv \frac{dt^a}{du} + \Gamma^a{}_{bc}\,t^b\,\frac{dx^c}{du} = \lambda(u)\,t^a. $$

Since the components of the tangent vector are $t^a = dx^a/du$, we find that the
equations satisfied by a geodesic are

$$ \frac{d^2x^a}{du^2} + \Gamma^a{}_{bc}\,\frac{dx^b}{du}\frac{dx^c}{du} = \lambda(u)\,\frac{dx^a}{du}. \tag{3.45} $$


Equation (3.45) is valid for both null and non-null geodesics parameterised
in terms of some general parameter $u$. If the curve is parameterised in such a
way that $\lambda(u)$ vanishes, however, then $u$ is a privileged parameter called an
affine parameter. From (3.44), we see that this corresponds to a parameterisation
in which the tangent vector is the same at all points along the curve (i.e. it is
parallel-transported), so that

$$ \frac{d\mathbf{t}}{du} = 0 \quad\Rightarrow\quad \frac{Dt^a}{Du} = 0. \tag{3.46} $$

The equations satisfied by an affinely parameterised geodesic are thus

$$ \frac{d^2x^a}{du^2} + \Gamma^a{}_{bc}\,\frac{dx^b}{du}\frac{dx^c}{du} = 0. \tag{3.47} $$


7 In a manifold with torsion, the two properties lead to different curves: a curve whose length is stationary with
respect to small variations in the path is called a metric geodesic, whereas a curve whose tangent vector is
constant along the path is an affine geodesic.


Since one is always free to choose an affine parameter, we shall henceforth 
restrict ourselves to this simplified form. In particular, for non-null geodesics 
a convenient affine parameter is the distance s measured along the curve. The 
geodesic equation (3.47) is one of the most important results for our study of 
particle motion in general relativity. 

Finally, we note how affine parameters are related to one another. If we change 
the parameterisation from an affine parameter $u$ to some other parameter $u'$ then 
the functions $x^a(u')$ describing the curve $\mathcal{C}$ in terms of the new parameter will differ from 
the original functions $x^a(u)$. If, for some arbitrary new parameter $u'$, we rewrite 
(3.47) in terms of derivatives with respect to $u'$ then the geodesic equation does 
not, in general, retain the form (3.47) but instead becomes 

$$\frac{d^2x^a}{du'^2} + \Gamma^a{}_{bc}\frac{dx^b}{du'}\frac{dx^c}{du'} = \left(\frac{d^2u/du'^2}{du/du'}\right)\frac{dx^a}{du'}. \tag{3.48}$$

It is clear from (3.48) that if $u$ is an affine parameter then so too is any linearly 
related parameter $u' = au + b$, where $a$ and $b$ are constants (i.e. they do not 
depend on position along the curve) and $a \neq 0$. 
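
The step from the affinely parameterised equation to the general form is a short chain-rule computation. As a sketch, regard the affine parameter as a function $u = u(u')$ of the new one, assuming $du/du' \neq 0$ (an assumption not stated explicitly above):

```latex
\frac{dx^a}{du} = \left(\frac{du}{du'}\right)^{-1}\frac{dx^a}{du'},
\qquad
\frac{d^2x^a}{du^2} = \left(\frac{du}{du'}\right)^{-2}
\left[\frac{d^2x^a}{du'^2} - \frac{d^2u/du'^2}{du/du'}\,\frac{dx^a}{du'}\right].
```

Substituting these into (3.47) and multiplying through by $(du/du')^2$ moves the second bracketed term to the right-hand side and gives (3.48). For $u' = au + b$ one has $u = (u' - b)/a$, so $d^2u/du'^2 = 0$ and the right-hand side vanishes, which is the statement that $u'$ is again affine.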


3.18 Stationary property of non-null geodesics 

Let us now consider non-null geodesics as curves of extremal length between two 
fixed points $A$ and $B$ in the manifold. Suppose that we describe the curve $x^a(u)$ 
in terms of some general (not necessarily affine) parameter $u$. The length along 
the curve is 

$$L = \int_A^B ds = \int_A^B \left|g_{ab}\dot{x}^a\dot{x}^b\right|^{1/2} du,$$

where the overdot is a shorthand for $d/du$. Now consider the variation in path 
$x^a(u) \to x^a(u) + \delta x^a(u)$, where $A$ and $B$ are fixed. The requirement for $x^a(u)$ to be 
a geodesic is that $\delta L = 0$ with respect to the variation in the path. This is a 
calculus-of-variations problem, (3.66), in which the integrand $F = \dot{s} = |g_{ab}\dot{x}^a\dot{x}^b|^{1/2}$. 
If we substitute this form for $F$ directly into the Euler-Lagrange equations 
(3.67), i.e. 

$$\frac{d}{du}\left(\frac{\partial F}{\partial \dot{x}^c}\right) - \frac{\partial F}{\partial x^c} = 0,$$

then we obtain 

$$\frac{d}{du}\left(\frac{g_{ac}\dot{x}^a}{\dot{s}}\right) - \frac{1}{2\dot{s}}(\partial_c g_{ab})\dot{x}^a\dot{x}^b = 0. \tag{3.49}$$

Noting that $\dot{g}_{ac} = (\partial_b g_{ac})\dot{x}^b$, the $u$-derivative is given by 

$$\frac{d}{du}\left(\frac{g_{ac}\dot{x}^a}{\dot{s}}\right) = \frac{1}{\dot{s}}\left[(\partial_b g_{ac})\dot{x}^a\dot{x}^b + g_{ac}\ddot{x}^a\right] - \frac{\ddot{s}}{\dot{s}^2}\,g_{ac}\dot{x}^a.$$

Substituting this expression back into (3.49) and rearranging yields 

$$g_{ac}\ddot{x}^a + (\partial_b g_{ac})\dot{x}^a\dot{x}^b - \tfrac{1}{2}(\partial_c g_{ab})\dot{x}^a\dot{x}^b = \left(\frac{\ddot{s}}{\dot{s}}\right)g_{ac}\dot{x}^a. \tag{3.50}$$

By interchanging dummy indices, we can write 
$(\partial_b g_{ac})\dot{x}^a\dot{x}^b = \tfrac{1}{2}(\partial_b g_{ac} + \partial_a g_{bc})\dot{x}^a\dot{x}^b$. 
Substituting this into (3.50), multiplying the whole equation by $g^{dc}$ 
and remembering that $g^{dc}g_{ac} = \delta^d_a$, we find that 

$$\ddot{x}^d + \tfrac{1}{2}g^{dc}(\partial_b g_{ac} + \partial_a g_{bc} - \partial_c g_{ab})\dot{x}^a\dot{x}^b = \left(\frac{\ddot{s}}{\dot{s}}\right)\dot{x}^d.$$

Finally, using the expression (3.21) for the connection in terms of the metric and 
relabelling indices, we obtain 

$$\ddot{x}^a + \Gamma^a{}_{bc}\dot{x}^b\dot{x}^c = \left(\frac{\ddot{s}}{\dot{s}}\right)\dot{x}^a. \tag{3.51}$$

Comparing this equation with (3.48) we see that the two are equivalent. We also 
see that, for a non-null geodesic, an affine parameter $u$ is related to the distance 
$s$ measured along the curve by $u = as + b$, where $a$ and $b$ are constants ($a \neq 0$). 


3.19 Lagrangian procedure for geodesics 

In order to obtain the parametric equations $x^a = x^a(u)$ of an affinely parameterised 
geodesic, we must solve the system of differential equations (3.47). Bearing in 
mind that the equations (3.21), which define the $\Gamma^a{}_{bc}$, are already complicated, 
it would seem a formidable procedure to set up the geodesic equations, let alone 
solve them. Nevertheless, in the previous section we found that the equations 
for a non-null geodesic arise very naturally from a variational approach. Looking 
back at the derivation of (3.51), however, we note this requires that $\dot{s} \neq 0$. Thus 
the proof is not valid for null geodesics. Fortunately, it is possible to set up a 
variational procedure which generates the equations of an affinely parameterised 
geodesic and which remains valid for null geodesics. This very neat procedure 
also produces the connection coefficients $\Gamma^a{}_{bc}$ as a spin-off. 

In standard classical mechanics, one can describe a system in terms of a set of 
generalised coordinates $x^a$ that are functions of time $t$. These coordinates define 
a space with a line element 

$$ds^2 = g_{ab}\,dx^a\,dx^b,$$

which, in classical mechanics, is called the configuration space of the system. One 
can form the Lagrangian for the system from the kinetic and potential energies, 

$$L = T - V = \tfrac{1}{2}g_{ab}\dot{x}^a\dot{x}^b - V(x),$$

where $\dot{x}^a = dx^a/dt$. By demanding that the action 

$$S = \int_{t_i}^{t_f} L\,dt$$

is stationary with respect to small variations in the functions $x^a(t)$, the equations 
of motion of the system are then found as the Euler-Lagrange equations 

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}^a}\right) - \frac{\partial L}{\partial x^a} = 0.$$

This should all be familiar to the reader (but is discussed in more detail in 
Chapter 19). Less familiar, perhaps, is how the equations of motion look if we 
write them out in full: 

$$\ddot{x}^a + \Gamma^a{}_{bc}\dot{x}^b\dot{x}^c = -g^{ab}\partial_b V.$$

These are just the equations of an affinely parameterised geodesic with a force 
term on the right-hand side. In this case, the $\Gamma^a{}_{bc}$ are the metric connections of 
the configuration space. If the forces vanish then Lagrange’s equations say that 
‘free’ particles move along geodesics in the configuration space. 

Thus, by analogy, in an arbitrary pseudo-Riemannian manifold we may obtain 
the equations for an affinely parameterised (null or non-null) geodesic $x^a(u)$ by 
considering the ‘Lagrangian’ 

$$L = g_{ab}\dot{x}^a\dot{x}^b,$$

where $\dot{x}^a = dx^a/du$ and we have omitted the irrelevant factor $\tfrac{1}{2}$. As can be shown 
directly, substituting this Lagrangian into the Euler-Lagrange equations 

$$\frac{d}{du}\left(\frac{\partial L}{\partial \dot{x}^a}\right) - \frac{\partial L}{\partial x^a} = 0 \tag{3.52}$$

yields, as required, 

$$\ddot{x}^a + \Gamma^a{}_{bc}\dot{x}^b\dot{x}^c = 0. \tag{3.53}$$

Performing this calculation, one finds that nowhere does it require $\dot{s} \neq 0$, and so 
it is valid for both null and non-null geodesics. Thus the Euler-Lagrange equations 
provide a useful way of generating the geodesic equations, and the connection 
coefficients may be extracted from the latter. 

We note that, in seeking solutions of the geodesic equations (3.53), it often 
helps to make use of the first integral of the equations. For null geodesics, the 
first integral is simply 

$$g_{ab}\dot{x}^a\dot{x}^b = 0, \tag{3.54}$$

whereas, for non-null geodesics, if we choose the parameter $u = s$ then 

$$|g_{ab}\dot{x}^a\dot{x}^b| = 1. \tag{3.55}$$

These results can prove extremely useful in solving the geodesic equations. 
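
As an illustration of the Lagrangian procedure, consider the Euclidean plane in polar coordinates $(\rho, \phi)$ with line element $ds^2 = d\rho^2 + \rho^2 d\phi^2$ (the example treated in Appendix 3B; using it at this point is our own choice). A minimal sketch of the calculation, with dots denoting $d/du$:

```latex
\begin{aligned}
L &= \dot{\rho}^2 + \rho^2\dot{\phi}^2, \\
\frac{d}{du}\left(2\dot{\rho}\right) - 2\rho\,\dot{\phi}^2 = 0
  \quad&\Rightarrow\quad \ddot{\rho} - \rho\,\dot{\phi}^2 = 0, \\
\frac{d}{du}\left(2\rho^2\dot{\phi}\right) = 0
  \quad&\Rightarrow\quad \ddot{\phi} + \frac{2}{\rho}\,\dot{\rho}\,\dot{\phi} = 0.
\end{aligned}
```

Comparing with the geodesic equation in the form $\ddot{x}^a + \Gamma^a{}_{bc}\dot{x}^b\dot{x}^c = 0$, and remembering that the cross term $\dot{\rho}\dot{\phi}$ receives equal contributions from $\Gamma^\phi{}_{\rho\phi}$ and $\Gamma^\phi{}_{\phi\rho}$, one reads off $\Gamma^\rho{}_{\phi\phi} = -\rho$ and $\Gamma^\phi{}_{\rho\phi} = \Gamma^\phi{}_{\phi\rho} = 1/\rho$, with all other coefficients zero. With $u = s$, the first integral for a non-null geodesic reads $\dot{\rho}^2 + \rho^2\dot{\phi}^2 = 1$.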

Demonstrating the equivalence of the geodesic and Euler-Lagrange equations 
allows us to make a useful observation. If the $g_{ab}$ do not depend on some particular 
coordinate $x^d$ (say) then (3.52) shows that 

$$\frac{\partial L}{\partial \dot{x}^d} = 2g_{db}\dot{x}^b = \text{constant}.$$

However, $\dot{x}^b = t^b$, where $\mathbf{t}$ is the tangent vector to the geodesic, and so we find that 

$$t_d = \text{constant}.$$

Thus, we have the important result that if the metric coefficients $g_{ab}$ do not depend 
on the coordinate $x^d$ then the $d$th covariant component $t_d$ of the tangent vector is 
a conserved quantity along an affinely parameterised geodesic. We will use this 
result often in our discussion of particle motion in general relativity. 


3.20 Alternative form of the geodesic equations 

The most common form of the geodesic equations is that given in (3.53). It is 
sometimes useful, however, to recast the geodesic equations in different forms. 
Thus, we note here an alternative way of writing them that will be of particular 
practical use when we come to study particle motion in general relativity. 

From (3.46), for a geodesic we have $d\mathbf{t}/du = 0$. In some coordinate system 
we may write this equation in terms of the intrinsic derivative of the covariant 
components of the tangent vector as 

$$\frac{Dt_a}{Du} = \frac{dt_a}{du} - \Gamma^b{}_{ac}\,t_b\frac{dx^c}{du} = 0.$$

Remembering that $t^c = \dot{x}^c = dx^c/du$, we thus have 

$$\dot{t}_a = \Gamma^b{}_{ac}\,t_b t^c,$$

which, on rewriting the connection coefficients $\Gamma^b{}_{ac}$ using (3.21), becomes 

$$\dot{t}_a = \tfrac{1}{2}g^{bd}(\partial_a g_{dc} + \partial_c g_{ad} - \partial_d g_{ac})\,t_b t^c = \tfrac{1}{2}(\partial_a g_{dc} + \partial_c g_{ad} - \partial_d g_{ac})\,t^d t^c.$$

Using the symmetry of the metric tensor, we see that the last two terms in the 
summation on $d$ and $c$ cancel. Thus, we obtain a useful alternative form of the 
geodesic equations, 

$$\dot{t}_a = \tfrac{1}{2}(\partial_a g_{cd})\,t^c t^d. \tag{3.56}$$

From this equation, we may immediately verify our earlier finding that if the 
metric $g_{cd}$ does not depend on the coordinate $x^a$ then $t_a = \text{constant}$. 
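
As a quick consistency check of (3.56), one can take the plane polar metric $g_{\rho\rho} = 1$, $g_{\phi\phi} = \rho^2$ of Appendix 3B (an illustrative choice on our part), for which $t_\rho = \dot{\rho}$ and $t_\phi = \rho^2\dot{\phi}$:

```latex
\begin{aligned}
\dot{t}_\rho &= \tfrac{1}{2}(\partial_\rho g_{\phi\phi})\,t^\phi t^\phi = \rho\,\dot{\phi}^2
  &&\Rightarrow\quad \ddot{\rho} - \rho\,\dot{\phi}^2 = 0, \\
\dot{t}_\phi &= \tfrac{1}{2}(\partial_\phi g_{cd})\,t^c t^d = 0
  &&\Rightarrow\quad \rho^2\dot{\phi} = \text{constant}.
\end{aligned}
```

These agree with the radial geodesic equation (3.61) and the conserved quantity (3.64) obtained in Appendix 3B from the connection coefficients.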


Appendix 3A: Vectors as directional derivatives 

In an arbitrary manifold, the formal mathematical definition of a tangent vector 
to a curve at some point $P$ is in terms of the directional derivative along the 
curve at that point. In particular, let us consider some curve $\mathcal{C}$ defined in terms of 
an arbitrary coordinate system by $x^a(u)$. In addition, suppose that some arbitrary 
scalar function $f(x^a)$ is defined on the manifold. At any point $P$ on the curve, the 
directional derivative of $f$ is defined simply as 

$$\frac{df}{du} = \frac{\partial f}{\partial x^a}\frac{dx^a}{du}$$

at that point. However, $t^a = dx^a/du$ gives the components of a tangent vector to 
the curve at $P$ and, since $f$ is arbitrary, we may write 

$$\frac{d}{du} = t^a\frac{\partial}{\partial x^a}.$$

Thus, the components $t^a$ define a unique directional derivative, which we may 
identify as the tangent vector $\mathbf{t}$. Moreover, it follows that the differential operators 
$\partial/\partial x^a$ are the coordinate basis vectors $\mathbf{e}_a$ at $P$, i.e. they are the tangent vectors to 
the coordinate curves at this point. 

In fact, any set of vector components $v^a$ defines a unique directional derivative 

$$v^a\frac{\partial}{\partial x^a}, \tag{3.57}$$

and, conversely, this directional derivative defines a unique set of components $v^a$. 
We may thus identify (3.57) as the vector $\mathbf{v}$. Thus the definition of a vector as a 
directional derivative replaces the more familiar notion of a directed line segment, 
which cannot be generalised to non-Euclidean manifolds. It is straightforward 
to verify that all the usual rules of vector algebra and the behaviour of the 
components $v^a$ under coordinate transformations follow immediately from (3.57). 


Appendix 3B: Polar coordinates in a plane 

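As a simple example of the material presented in this chapter, let us consider 
the special case of a two-dimensional Euclidean plane. The most common way 
of labelling points in a plane is by using Cartesian coordinates $(x, y)$, but it is 
sometimes convenient to use plane polar coordinates $(\rho, \phi)$. The two coordinate 
systems are related by the equations 

$$\rho = (x^2 + y^2)^{1/2}, \qquad \phi = \tan^{-1}(y/x),$$

and their inverses 

$$x = \rho\cos\phi, \qquad y = \rho\sin\phi.$$

The transformation matrices relating these two sets of coordinates are 

$$\begin{pmatrix} \partial\rho/\partial x & \partial\rho/\partial y \\ \partial\phi/\partial x & \partial\phi/\partial y \end{pmatrix} = \begin{pmatrix} \cos\phi & \sin\phi \\ -\dfrac{1}{\rho}\sin\phi & \dfrac{1}{\rho}\cos\phi \end{pmatrix}, \qquad \begin{pmatrix} \partial x/\partial\rho & \partial x/\partial\phi \\ \partial y/\partial\rho & \partial y/\partial\phi \end{pmatrix} = \begin{pmatrix} \cos\phi & -\rho\sin\phi \\ \sin\phi & \rho\cos\phi \end{pmatrix},$$

which are easily shown to be inverses of one another. For convenience, in the 
following we will sometimes refer to the polar coordinates as the coordinate 
system $x^a$ ($a = 1, 2$). 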






Figure 3.7 Labelling points in a plane with Cartesian coordinates and plane 
polar coordinates. Examples of basis vectors for the two systems are also shown. 

Basis vectors Let us now consider the coordinate basis vectors in each system. 
The coordinate curves for each system are shown as dotted lines in Figure 3.7 
and the basis vectors are tangents to these curves. For the Cartesian coordinates, 
$\mathbf{e}_x$ and $\mathbf{e}_y$ have the special property that they are the same at every point $P$ in the 
plane. They are of unit length and point along the $x$- and $y$-directions respectively, 
and we can write 

$$d\mathbf{s} = dx\,\mathbf{e}_x + dy\,\mathbf{e}_y.$$

In plane polar coordinates this becomes 

$$d\mathbf{s} = (\cos\phi\,d\rho - \rho\sin\phi\,d\phi)\,\mathbf{e}_x + (\sin\phi\,d\rho + \rho\cos\phi\,d\phi)\,\mathbf{e}_y, \tag{3.58}$$

and so, using the definition (3.3) of the coordinate basis vectors, we obtain 

$$\mathbf{e}_\rho = \cos\phi\,\mathbf{e}_x + \sin\phi\,\mathbf{e}_y, \tag{3.59}$$

$$\mathbf{e}_\phi = -\rho\sin\phi\,\mathbf{e}_x + \rho\cos\phi\,\mathbf{e}_y. \tag{3.60}$$

Alternatively, we could have arrived at the same result using the transformation 
equations (3.9) for basis vectors. The basis vectors $\mathbf{e}_\rho$ and $\mathbf{e}_\phi$ are shown in 
Figure 3.7. 


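Metric components Substituting the expressions (3.59) and (3.60) into the result 
$g_{ab} = \mathbf{e}_a\cdot\mathbf{e}_b$, we find that in polar coordinates 

$$g_{\rho\rho} = \mathbf{e}_\rho\cdot\mathbf{e}_\rho = 1, \qquad g_{\phi\phi} = \mathbf{e}_\phi\cdot\mathbf{e}_\phi = \rho^2, \qquad g_{\rho\phi} = g_{\phi\rho} = \mathbf{e}_\rho\cdot\mathbf{e}_\phi = 0.$$

Thus, we have 

$$ds^2 = d\mathbf{s}\cdot d\mathbf{s} = g_{ab}\,dx^a\,dx^b = d\rho^2 + \rho^2\,d\phi^2,$$

which matches the result obtained using (3.58) directly. The matrix $[g^{ab}]$ is the 
inverse of the matrix $[g_{ab}]$ and thus is given by 

$$[g^{ab}] = \begin{pmatrix} 1 & 0 \\ 0 & 1/\rho^2 \end{pmatrix}.$$

Dual basis The dual basis vectors given by $\mathbf{e}^a = g^{ab}\mathbf{e}_b$ are 

$$\mathbf{e}^\rho = g^{\rho\rho}\mathbf{e}_\rho + g^{\rho\phi}\mathbf{e}_\phi = \mathbf{e}_\rho,$$

$$\mathbf{e}^\phi = g^{\phi\rho}\mathbf{e}_\rho + g^{\phi\phi}\mathbf{e}_\phi = \frac{1}{\rho^2}\,\mathbf{e}_\phi,$$

where no summation is implied over $\rho$ or $\phi$. These dual basis vectors are easily 
shown to obey the reciprocity relation $\mathbf{e}^a\cdot\mathbf{e}_b = \delta^a_b$. 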


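Derivatives of basis vectors Since $\mathbf{e}_x$ and $\mathbf{e}_y$ are constant vector fields, the 
derivatives of the polar coordinate basis vectors are easily found as 

$$\frac{\partial\mathbf{e}_\rho}{\partial\rho} = \frac{\partial}{\partial\rho}(\cos\phi\,\mathbf{e}_x + \sin\phi\,\mathbf{e}_y) = 0,$$

$$\frac{\partial\mathbf{e}_\rho}{\partial\phi} = \frac{\partial}{\partial\phi}(\cos\phi\,\mathbf{e}_x + \sin\phi\,\mathbf{e}_y) = -\sin\phi\,\mathbf{e}_x + \cos\phi\,\mathbf{e}_y = \frac{1}{\rho}\,\mathbf{e}_\phi.$$

These have a simple geometrical picture. At each of two nearby points $P$ and 
$Q$ the vector $\mathbf{e}_\rho$ must point away from the origin, and so in slightly different 
directions. The derivative of $\mathbf{e}_\rho$ with respect to $\phi$ is just the difference 
between $\mathbf{e}_\rho$ at $P$ and $Q$ divided by $\delta\phi$ (the angle between them). The difference 
in this case is clearly a vector parallel to $\mathbf{e}_\phi$, which makes the above results 
reasonable. Similarly, 

$$\frac{\partial\mathbf{e}_\phi}{\partial\rho} = \frac{\partial}{\partial\rho}(-\rho\sin\phi\,\mathbf{e}_x + \rho\cos\phi\,\mathbf{e}_y) = -\sin\phi\,\mathbf{e}_x + \cos\phi\,\mathbf{e}_y = \frac{1}{\rho}\,\mathbf{e}_\phi,$$

$$\frac{\partial\mathbf{e}_\phi}{\partial\phi} = \frac{\partial}{\partial\phi}(-\rho\sin\phi\,\mathbf{e}_x + \rho\cos\phi\,\mathbf{e}_y) = -\rho\cos\phi\,\mathbf{e}_x - \rho\sin\phi\,\mathbf{e}_y = -\rho\,\mathbf{e}_\rho.$$

The student is encouraged to explain these formulae geometrically. 

Connection coefficients Using the general formula $\partial_c\mathbf{e}_a = \Gamma^b{}_{ac}\mathbf{e}_b$, we can now 
read off the connection coefficients in plane polar coordinates: 

$$\begin{aligned}
\frac{\partial\mathbf{e}_\rho}{\partial\rho} &= 0 &&\Rightarrow\quad \Gamma^\rho{}_{\rho\rho} = 0, \quad \Gamma^\phi{}_{\rho\rho} = 0, \\
\frac{\partial\mathbf{e}_\rho}{\partial\phi} &= \frac{1}{\rho}\,\mathbf{e}_\phi &&\Rightarrow\quad \Gamma^\rho{}_{\rho\phi} = 0, \quad \Gamma^\phi{}_{\rho\phi} = \frac{1}{\rho}, \\
\frac{\partial\mathbf{e}_\phi}{\partial\rho} &= \frac{1}{\rho}\,\mathbf{e}_\phi &&\Rightarrow\quad \Gamma^\rho{}_{\phi\rho} = 0, \quad \Gamma^\phi{}_{\phi\rho} = \frac{1}{\rho}, \\
\frac{\partial\mathbf{e}_\phi}{\partial\phi} &= -\rho\,\mathbf{e}_\rho &&\Rightarrow\quad \Gamma^\rho{}_{\phi\phi} = -\rho, \quad \Gamma^\phi{}_{\phi\phi} = 0,
\end{aligned}$$

where no summation is assumed over repeated indices. Thus, although we 
computed the derivatives of $\mathbf{e}_\rho$ and $\mathbf{e}_\phi$ by using the constancy of $\mathbf{e}_x$ and $\mathbf{e}_y$, the 
Cartesian basis vectors do not appear in the above equations. The connection’s 
importance is that it enables one to express these derivatives without using 
any other coordinates than polar. We can alternatively calculate the connection 
coefficients from the metric using the general result (3.21). For example, 

$$\Gamma^\phi{}_{\rho\phi} = \tfrac{1}{2}g^{a\phi}(\partial_\phi g_{a\rho} + \partial_\rho g_{a\phi} - \partial_a g_{\rho\phi}),$$

where summation is implied only over the index $a$. Since $g^{\rho\phi} = 0$ and $g^{\phi\phi} = 1/\rho^2$, 
we have 

$$\Gamma^\phi{}_{\rho\phi} = \tfrac{1}{2}g^{\phi\phi}(\partial_\phi g_{\phi\rho} + \partial_\rho g_{\phi\phi} - \partial_\phi g_{\rho\phi}) = \frac{1}{2\rho^2}\,\partial_\rho g_{\phi\phi} = \frac{1}{2\rho^2}\,\partial_\rho(\rho^2) = \frac{1}{\rho}.$$

This is the same expression for $\Gamma^\phi{}_{\rho\phi}$ as that derived above. Indeed, this method of 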
computing the connection is generally far more straightforward than calculating 
the derivatives of basis vectors. 
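
The metric route also lends itself to a mechanical check. The following is a minimal sketch, not part of the original text, that evaluates (3.21) numerically for the plane polar metric using central finite differences. The function names (`metric`, `inverse_metric`, `metric_derivative`, `christoffel`) and the step size `h` are illustrative assumptions of ours, and the matrix inverse is hard-coded for a diagonal metric.

```python
def metric(x):
    """Covariant components g_ab of the plane polar metric at x = (rho, phi)."""
    rho, _phi = x
    return [[1.0, 0.0], [0.0, rho * rho]]


def inverse_metric(x):
    """Contravariant components g^ab (valid here because g_ab is diagonal)."""
    g = metric(x)
    return [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]


def metric_derivative(x, c, h=1e-5):
    """Central-difference estimate of d_c g_ab at x, returned as a 2x2 list."""
    x_plus, x_minus = list(x), list(x)
    x_plus[c] += h
    x_minus[c] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    return [[(g_plus[a][b] - g_minus[a][b]) / (2.0 * h) for b in range(2)]
            for a in range(2)]


def christoffel(x):
    """Gamma[a][b][c] = (1/2) g^{ad} (d_b g_{dc} + d_c g_{bd} - d_d g_{bc})."""
    n = 2
    g_inv = inverse_metric(x)
    dg = [metric_derivative(x, c) for c in range(n)]  # dg[c][a][b] = d_c g_ab
    return [[[0.5 * sum(g_inv[a][d] * (dg[b][d][c] + dg[c][b][d] - dg[d][b][c])
                        for d in range(n))
              for c in range(n)]
             for b in range(n)]
            for a in range(n)]


RHO, PHI = 0, 1                      # index labels for the two coordinates
gamma = christoffel((2.0, 0.3))      # evaluate at rho = 2, phi = 0.3
print(gamma[PHI][RHO][PHI])          # close to 1/rho = 0.5
print(gamma[RHO][PHI][PHI])          # close to -rho = -2.0
```

Because the only non-constant component is quadratic in $\rho$, the central difference is exact up to rounding here; for a general metric the result would carry a truncation error of order `h**2`.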


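Covariant derivative Given the connection coefficients, we can calculate the 
covariant derivative of a vector field in polar coordinates. As an example of its 
use, let us find an expression for the divergence $\nabla\cdot\mathbf{v}$ of a vector field. This is 
given by 

$$\nabla\cdot\mathbf{v} = \nabla_a v^a = \partial_a v^a + \Gamma^a{}_{ba}v^b.$$

Now, the contracted connection coefficients are given by 

$$\Gamma^a{}_{\rho a} = \Gamma^\rho{}_{\rho\rho} + \Gamma^\phi{}_{\rho\phi} = \frac{1}{\rho},$$

$$\Gamma^a{}_{\phi a} = \Gamma^\rho{}_{\phi\rho} + \Gamma^\phi{}_{\phi\phi} = 0,$$

so we have 

$$\nabla\cdot\mathbf{v} = \frac{\partial v^\rho}{\partial\rho} + \frac{1}{\rho}v^\rho + \frac{\partial v^\phi}{\partial\phi} = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho v^\rho) + \frac{\partial v^\phi}{\partial\phi}.$$

This formula may not be immediately familiar. The reason for this is that most 
often a vector $\mathbf{v}$ is expressed in terms of the normalised basis vectors $\hat{\mathbf{e}}_\rho = \mathbf{e}_\rho$ 
and $\hat{\mathbf{e}}_\phi = \mathbf{e}_\phi/\rho$. In this normalised basis the vector components are $\hat{v}^\rho = v^\rho$ and 
$\hat{v}^\phi = \rho v^\phi$, and the divergence takes its more usual form 

$$\nabla\cdot\mathbf{v} = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho\hat{v}^\rho) + \frac{1}{\rho}\frac{\partial\hat{v}^\phi}{\partial\phi}.$$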
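
A simple check of the divergence formula, using a field chosen here purely for illustration: the radial field $\mathbf{v} = x\,\mathbf{e}_x + y\,\mathbf{e}_y = \rho\,\mathbf{e}_\rho$ has coordinate-basis components $v^\rho = \rho$, $v^\phi = 0$, so that

```latex
\nabla\cdot\mathbf{v}
  = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\cdot\rho\right)
    + \frac{\partial(0)}{\partial\phi}
  = 2,
```

in agreement with the Cartesian result $\partial x/\partial x + \partial y/\partial y = 2$. The normalised-basis form gives the same value, since $\hat{v}^\rho = v^\rho$ and $\hat{v}^\phi = 0$ for this field.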


Geodesics Finally, let us consider a geodesic in a plane. We already know that 
the answer is a straight line, and this is trivially proven in Cartesian coordinates. 
For illustration, however, let us perform the calculation the hard way, i.e. in plane 
polar coordinates. There are two geodesic equations, 

$$\frac{d^2x^a}{ds^2} + \Gamma^a{}_{bc}\frac{dx^b}{ds}\frac{dx^c}{ds} = 0,$$

for $a = \rho, \phi$, where we are using the arc length $s$ as our parameter along the 
geodesic. The only non-zero connection coefficients are $\Gamma^\rho{}_{\phi\phi} = -\rho$ and 
$\Gamma^\phi{}_{\rho\phi} = \Gamma^\phi{}_{\phi\rho} = 1/\rho$. Thus, writing out the geodesic equations for $a = \rho$ and $a = \phi$, 
we have 

$$\frac{d^2\rho}{ds^2} - \rho\left(\frac{d\phi}{ds}\right)^2 = 0, \tag{3.61}$$

$$\frac{d^2\phi}{ds^2} + \frac{2}{\rho}\frac{d\rho}{ds}\frac{d\phi}{ds} = 0. \tag{3.62}$$

Also, since in a Euclidean plane we can only have non-null geodesics, a first 
integral of these equations is provided by 

$$g_{ab}\frac{dx^a}{ds}\frac{dx^b}{ds} = \left(\frac{d\rho}{ds}\right)^2 + \rho^2\left(\frac{d\phi}{ds}\right)^2 = 1. \tag{3.63}$$

Of course, this could have been obtained simply by dividing through $ds^2 = 
d\rho^2 + \rho^2 d\phi^2$ by $ds^2$. 

Equation (3.62) can be written as 

$$\frac{1}{\rho^2}\frac{d}{ds}\left(\rho^2\frac{d\phi}{ds}\right) = 0,$$

from which we obtain 

$$\rho^2\frac{d\phi}{ds} = k = \text{constant}. \tag{3.64}$$

Inserting this into (3.63), we find that 

$$\frac{d\rho}{ds} = \left(1 - \frac{k^2}{\rho^2}\right)^{1/2}. \tag{3.65}$$

The shape of the geodesic is what really interests us, i.e. $\rho$ as a function of $\phi$ or 
vice versa. Dividing (3.64) by (3.65), we obtain 

$$\frac{d\phi}{d\rho} = \frac{k}{\rho^2}\left(1 - \frac{k^2}{\rho^2}\right)^{-1/2},$$

which can be integrated easily to give 

$$\phi = \phi_0 + \cos^{-1}\left(\frac{k}{\rho}\right),$$

where $\phi_0$ is the integration constant. The shape of the geodesic is given by 

$$\rho\cos(\phi - \phi_0) = k,$$

which, on expanding the cosine and using $x = \rho\cos\phi$ and $y = \rho\sin\phi$, gives 

$$x\cos\phi_0 + y\sin\phi_0 = k.$$

This is the general equation of a straight line. Thus we recover the familiar result 
in an unfamiliar coordinate system. 
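
The same conclusion can be checked numerically. The sketch below, which is our own illustration rather than anything from the original text, integrates the two polar geodesic equations (3.61) and (3.62) with a classical fourth-order Runge-Kutta scheme and then converts the end point back to Cartesian coordinates. The initial data, step count and function names are arbitrary assumptions made for the example.

```python
import math


def geodesic_rhs(state):
    """Geodesic equations (3.61)-(3.62) written as a first-order system.

    state = (rho, phi, d rho/ds, d phi/ds).
    """
    rho, phi, rho_dot, phi_dot = state
    return (rho_dot,
            phi_dot,
            rho * phi_dot ** 2,               # rho'' from (3.61)
            -2.0 * rho_dot * phi_dot / rho)   # phi'' from (3.62)


def rk4_step(state, h):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, s_end, n_steps):
    """Integrate from s = 0 to s = s_end in n_steps equal steps."""
    h = s_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, h)
    return state


# Start at (rho, phi) = (1, 0) with unit speed: rho_dot^2 + rho^2 phi_dot^2 = 1.
# In Cartesian terms this is the point (1, 0) with velocity (0.6, 0.8).
start = (1.0, 0.0, 0.6, 0.8)
rho, phi, rho_dot, phi_dot = integrate(start, s_end=2.0, n_steps=2000)

x, y = rho * math.cos(phi), rho * math.sin(phi)
k = rho ** 2 * phi_dot                               # conserved, cf. (3.64)
speed_sq = rho_dot ** 2 + rho ** 2 * phi_dot ** 2    # first integral, cf. (3.63)
print(x, y, k, speed_sq)
```

If the geodesic is a straight line, the end point after arc length 2 should be close to $(1 + 2\times0.6,\ 2\times0.8) = (2.2, 1.6)$, with $k$ remaining near its initial value 0.8 and the first integral remaining near 1.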


Appendix 3C: Calculus of variations 

The calculus of variations provides a means of finding a function (or set of 
functions) that makes an integral dependent on the function(s) stationary, i.e. 
makes the value of the integral a local maximum or minimum. Let us consider 
the path integral 

$$I = \int_A^B F(x^a, \dot{x}^a, u)\,du, \tag{3.66}$$

where $A$, $B$ and the form of the integrand $F$ are fixed, but the ‘curve’ or path 
$x^a(u)$ has to be chosen so as to make stationary the value of $I$. From (3.66), 
we see that we are considering quite a general case, in which the integrand 
$F$ is a function of the $2N$ independent functions $x^a$ and $\dot{x}^a = dx^a/du$ and the 
parameter $u$. 

Now consider making an arbitrary variation $x^a(u) \to x^a(u) + \delta x^a(u)$ in the 
path, keeping the endpoints $A$ and $B$ fixed. The corresponding first-order variation 
in the value of the integral is 

$$\delta I = \int_A^B \delta F\,du = \int_A^B \left(\frac{\partial F}{\partial x^a}\,\delta x^a + \frac{\partial F}{\partial \dot{x}^a}\,\delta\dot{x}^a\right) du.$$

Integrating the last term by parts and requiring the variation $\delta I$ to be zero, we 
obtain 

$$\delta I = \left[\frac{\partial F}{\partial \dot{x}^a}\,\delta x^a\right]_A^B + \int_A^B \left[\frac{\partial F}{\partial x^a} - \frac{d}{du}\left(\frac{\partial F}{\partial \dot{x}^a}\right)\right]\delta x^a\,du = 0.$$

Since $A$ and $B$ are fixed, the first term vanishes. Then, since $\delta x^a$ is arbitrary, our 
required extremal curve $x^a(u)$ must satisfy the $N$ equations 

$$\frac{d}{du}\left(\frac{\partial F}{\partial \dot{x}^a}\right) - \frac{\partial F}{\partial x^a} = 0. \tag{3.67}$$

These are the Euler-Lagrange equations for the problem. 
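
As a minimal worked instance (with an integrand of our own choosing, not one taken from the text), take $F = \tfrac{1}{2}\delta_{ab}\dot{x}^a\dot{x}^b - V(x)$, where $V$ depends on the $x^a$ only:

```latex
\frac{\partial F}{\partial \dot{x}^a} = \delta_{ab}\,\dot{x}^b,
\qquad
\frac{\partial F}{\partial x^a} = -\,\partial_a V
\qquad\Rightarrow\qquad
\delta_{ab}\,\ddot{x}^b + \partial_a V = 0 .
```

This is Newton's second law for a particle of unit mass in Cartesian coordinates; for $V = 0$ it reduces to $\ddot{x}^a = 0$, so the stationary paths are straight lines traversed at a uniform rate.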


Exercises 


3.1 Show that, in general, $\mathbf{e}_a = g_{ab}\mathbf{e}^b$ and $\mathbf{e}^a = g^{ab}\mathbf{e}_b$. Show also that, under a coordinate 
transformation, 

$$\mathbf{e}'_a = \frac{\partial x^b}{\partial x'^a}\,\mathbf{e}_b \qquad\text{and}\qquad \mathbf{e}'^a = \frac{\partial x'^a}{\partial x^b}\,\mathbf{e}^b.$$

3.2 Calculate the coordinate basis vectors $\mathbf{e}'_a$ of the coordinate system $x'^a$ in Exercise 2.3 
in terms of the coordinate basis vectors $\mathbf{e}_a$ of the Cartesian system. Hence verify 
that the metric functions $g'_{ab}$ agree with those found earlier. Calculate the dual basis 
vectors $\mathbf{e}'^a$ in the primed system and hence the quantities $g'^{ab}$. Find the contravariant 
and covariant components of $\mathbf{e}_x$ in the primed basis. Hence verify that $\mathbf{e}_x$ is of unit 
length. 

3.3 For any metric $g_{ab}$ show that $g_{ab}g^{ab} = N$, where $N$ is the dimension of the manifold. 

3.4 Show that the affine connection can be written as $\Gamma^b{}_{ac} = \mathbf{e}^b\cdot\partial_c\mathbf{e}_a$. Show further that, 
in a torsionless manifold, $\partial_c\mathbf{e}_a = \partial_a\mathbf{e}_c$. 

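3.5 Show that, under a coordinate transformation, the affine connection transforms as 

$$\Gamma'^a{}_{bc} = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^f}{\partial x'^b}\frac{\partial x^g}{\partial x'^c}\,\Gamma^d{}_{fg} - \frac{\partial x^d}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\frac{\partial^2 x'^a}{\partial x^d\,\partial x^f}.$$

3.6 For a diagonal metric $g_{ab}$, show that the connection coefficients are given by (with 
$a \neq b \neq c$ and no summation over repeated indices) 

$$\Gamma^a{}_{bc} = 0, \qquad \Gamma^b{}_{aa} = -\frac{1}{2g_{bb}}\,\partial_b g_{aa},$$

$$\Gamma^a{}_{ba} = \Gamma^a{}_{ab} = \partial_b\left(\ln\sqrt{|g_{aa}|}\right), \qquad \Gamma^a{}_{aa} = \partial_a\left(\ln\sqrt{|g_{aa}|}\right).$$

3.7 Let $g$ be the determinant of the matrix $[g_{ab}]$. By considering the cofactor of the 
element $g_{ab}$ in this determinant, or otherwise, show that $\partial_c g = g\,g^{ab}(\partial_c g_{ab})$. 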






3.8 In a manifold with non-zero torsion, show that the affine connection defined by 
(3.11) may be written as 

$$\Gamma^a{}_{bc} = \left\{{a \atop bc}\right\} - \tfrac{1}{2}\left(T^a{}_{cb} + T_c{}^a{}_b - T_{bc}{}^a\right),$$

where $\left\{{a \atop bc}\right\}$ is the metric connection defined by the right-hand side of (3.21) and $T^a{}_{bc}$ 
is the torsion tensor defined in (3.18). Defining an index symmetrisation operation 
such that $\Gamma^a{}_{(bc)} = \tfrac{1}{2}(\Gamma^a{}_{bc} + \Gamma^a{}_{cb})$, show further that 

$$\Gamma^a{}_{(bc)} = \left\{{a \atop bc}\right\} + T_{(bc)}{}^a.$$

3.9 In a manifold with non-zero torsion, show that the condition $\Gamma^a{}_{bc} = 0$ implies that 
$\partial_a g_{bc} = 0$ but not vice versa. Show further that, under a coordinate transformation 
of the form 

$$x'^a = x^a - x^a_P + \tfrac{1}{2}\Gamma^a{}_{bc}(P)\,(x^b - x^b_P)(x^c - x^c_P),$$

the affine connection at the point $P$ in the new coordinate system is given by 

$$\Gamma'^a{}_{bc}(P) = \tfrac{1}{2}T^a{}_{bc}(P)$$

and hence the transformation does not yield a set of geodesic coordinates. Is it still 
possible to define local Cartesian coordinates in a manifold with non-zero torsion? 

3.10 Show that, for the covariant components $v_a$ of a vector, the covariant derivative and 
the intrinsic derivative along a curve are given respectively by 

$$\nabla_b v_a = \partial_b v_a - \Gamma^c{}_{ab}v_c \qquad\text{and}\qquad \frac{Dv_a}{Du} = \frac{dv_a}{du} - \Gamma^b{}_{ac}\,v_b\frac{dx^c}{du}.$$

3.11 Show that for a vector field with contravariant components $v^b$ to have a vanishing 
covariant derivative $\nabla_a v^b$ everywhere in a manifold, it must satisfy the relation 

$$\left(\partial_b\Gamma^d{}_{ac} - \partial_c\Gamma^d{}_{ab} + \Gamma^e{}_{ac}\Gamma^d{}_{eb} - \Gamma^e{}_{ab}\Gamma^d{}_{ec}\right)v^a = 0.$$

Hint: Use the fact that partial derivatives commute. 

3.12 If a vector field $v^a$ vanishes on a hypersurface $S$ that bounds a region $V$ of an 
$N$-dimensional manifold, show that 

$$\int_V (\nabla_a v^a)\sqrt{|g|}\;d^Nx = 0.$$

J v 

3.13 On the surface of a unit sphere, $ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2$. Calculate the connection 
coefficients in the $(\theta, \phi)$ coordinate system. A vector $\mathbf{v}$ of unit length is defined 
at the point $(\theta_0, 0)$ as parallel to the circle $\phi = 0$. Calculate the components of $\mathbf{v}$ 
after it has been parallel-transported around the circle $\theta = \theta_0$. Hence show that, 
in general, after parallel transport the direction of $\mathbf{v}$ is different but its length is 
unchanged. 

3.14 If the two vectors with contravariant components $v^a$ and $w^a$ are each parallel-transported 
along a curve, show that $v^a w_a$ remains constant along the curve. Hence 
show that if a geodesic is timelike (or null or spacelike) at some point, it is timelike 
(or null or spacelike) at all points. 

3.15 An affinely parameterised geodesic $x^a(u)$ satisfies 

$$\frac{d^2x^a}{du^2} + \Gamma^a{}_{bc}\frac{dx^b}{du}\frac{dx^c}{du} = 0.$$

Show that the form of this equation remains unchanged by an arbitrary coordinate 
transformation $x^a \to x'^a$. Find the form of the geodesic equation for a geodesic 
described in terms of some general (non-affine) parameter $\lambda$. Hence show that all 
affine parameters are related by a linear transformation with constant coefficients. 

3.16 If $x^\mu(\lambda)$ is an affinely parameterised geodesic, show that 

$$\frac{Du_\mu}{D\lambda} = 0,$$

where $u^\mu = dx^\mu/d\lambda$. Hence show that the geodesic equations can be written as 

$$\frac{du_\mu}{d\lambda} = \tfrac{1}{2}(\partial_\mu g_{\nu\sigma})\,u^\nu u^\sigma.$$

3.17 By substituting the ‘Lagrangian’ $L = g_{ab}\dot{x}^a\dot{x}^b$ into the Euler-Lagrange equations, 
show directly that 

$$\ddot{x}^a + \Gamma^a{}_{bc}\dot{x}^b\dot{x}^c = 0,$$

where the dots denote differentiation with respect to an affine parameter. 

3.18 By transforming from a local inertial coordinate system $\xi^\mu$ in which 

$$ds^2 = c^2 d\tau^2 = \eta_{\mu\nu}\,d\xi^\mu\,d\xi^\nu,$$

to a general coordinate system $x^\mu$, show that freely falling particles obey the geodesic 
equations of motion 

$$\frac{d^2x^\lambda}{d\tau^2} + \Gamma^\lambda{}_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0,$$

where 

$$\Gamma^\lambda{}_{\mu\nu} = \frac{\partial x^\lambda}{\partial\xi^\alpha}\frac{\partial^2\xi^\alpha}{\partial x^\mu\,\partial x^\nu}.$$

3.19 By considering the ‘Lagrangian’ $L = g_{ab}\dot{x}^a\dot{x}^b$, derive the equations for an affinely 
parameterised geodesic on the surface of a sphere in the coordinates $(\theta, \phi)$. Hence 
show that, of all the circles of constant latitude on a sphere, only the equator is a 
geodesic. Use your geodesic equations to pick out the connection coefficients in 
this coordinate system. 

3.20 In the 2-space with line element 

$$ds^2 = \frac{dr^2 + r^2 d\theta^2}{r^2 - a^2} - \frac{r^2 dr^2}{(r^2 - a^2)^2},$$

where $r > a$, show that the differential equation for the geodesics may be written as 

$$a^2\left(\frac{dr}{d\theta}\right)^2 + a^2r^2 = Kr^4,$$

where $K$ is a constant such that $K = 1$ if the geodesic is null. By setting $r\,d\theta/dr = 
\tan\phi$, show that the space is mapped onto a Euclidean plane in which $(r, \phi)$ are 
taken as polar coordinates and the geodesics are mapped to straight lines. 


Tensor calculus on manifolds 


The coordinates with which one labels points in a manifold are entirely arbitrary. 
For example, we could choose to parameterise the surface of a sphere in terms 
of the coordinates $(\theta, \phi)$, taking any point as the north pole, or we could use 
any number of alternative coordinate systems. It is also clear, however, that our 
description of any physical processes occurring on the surface of the sphere should 
not depend on our chosen coordinate system. For example, at any point $P$ on the 
surface one can say that the air temperature has a particular value 
or that the wind has a certain speed in a particular direction. These respectively 
scalar and vector physical quantities do not depend on which coordinates are used 
to label points in the surface. Thus, in order to describe these physical fields 
on the surface, we must formulate our equations in a way that is valid in all 
coordinate systems. We have already dealt with such a description for scalar and 
vector quantities on manifolds, but now we turn to the generalisation of these 
ideas to quantities that cannot be described as a scalar or a vector. This requires 
the introduction of the concept of tensors. 


4.1 Tensor fields on manifolds 

Let us begin by considering vector fields in a slightly different manner. Suppose 
we have some arbitrary vector field, defining a vector $\mathbf{t}$ at each point of a manifold. 
How can we obtain from $\mathbf{t}$ a scalar field? Clearly, the only way to do this is to 
take the scalar product of $\mathbf{t}$ with a vector $\mathbf{v}$ from another vector field. Thus, at 
each point $P$ in the manifold, we can think of the vector $\mathbf{t}$ in $T_P$ as a linear function 
$\mathbf{t}(\cdot)$ that takes another vector in $T_P$ as its argument and produces a real number. 
We can denote the number produced by the action of $\mathbf{t}$ on a particular vector $\mathbf{v}$ by 

$$\mathbf{t}(\mathbf{v}) = \mathbf{t}\cdot\mathbf{v}. \tag{4.1}$$

It is now clear how we can generalise the notion of a vector: in the tangent 
space $T_P$, we can define a tensor $\mathbf{t}$ as a linear map from some number of vectors 
to the real numbers. The rank of the tensor is the number of vectors it has for its 
arguments. For example, we can write a third-rank tensor as $\mathbf{t}(\cdot, \cdot, \cdot)$. Once again, 
we denote the number that the tensor $\mathbf{t}$ produces from the vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ by 

$$\mathbf{t}(\mathbf{u}, \mathbf{v}, \mathbf{w}).$$

The tensor is defined by the precise set of operations applied to the vectors $\mathbf{u}$, $\mathbf{v}$ 
and $\mathbf{w}$ to produce a scalar. Notice, however, that the definition of a tensor does 
not mention the components of the vectors; a tensor must give the same real 
number independently of the reference system in which the vector components 
are calculated. If at each point $P$ in some region of the manifold we have a tensor 
defined then the result is a tensor field in this region. 

In fact we have already encountered examples of tensors. Clearly, from our 
above discussion, any vector is a rank-1 tensor. Higher-rank tensors thus constitute 
a generalisation of the concept of a vector. For example, a particularly important 
second-rank tensor is the metric tensor $\mathbf{g}$, which we have already met. This defines 
a linear map of two vectors into the number that is their inner product, i.e. 

$$\mathbf{g}(\mathbf{u}, \mathbf{v}) = \mathbf{u}\cdot\mathbf{v}.$$

We will investigate the properties of this special tensor shortly. Finally, we note 
also that a scalar function of position $\phi(x)$ is a real-valued function of no vectors 
at all, and is therefore classified as a zero-rank tensor. 

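The fact that a tensor is a linear map of the vectors into the reals is particularly 
useful. For simplicity, let us consider a rank-1 tensor. Linearity means that, for 
general vectors $\mathbf{u}$ and $\mathbf{v}$ and general scalars $\alpha$ and $\beta$, 

$$\mathbf{t}(\alpha\mathbf{u} + \beta\mathbf{v}) = \alpha\,\mathbf{t}(\mathbf{u}) + \beta\,\mathbf{t}(\mathbf{v}).$$

Similar expansions may be performed for tensors of higher rank. For a second-rank 
tensor, for example, we can write 

$$\begin{aligned}
\mathbf{t}(\alpha\mathbf{u} + \beta\mathbf{v}, \gamma\mathbf{w} + \epsilon\mathbf{z}) &= \alpha\,\mathbf{t}(\mathbf{u}, \gamma\mathbf{w} + \epsilon\mathbf{z}) + \beta\,\mathbf{t}(\mathbf{v}, \gamma\mathbf{w} + \epsilon\mathbf{z}) \\
&= \alpha\gamma\,\mathbf{t}(\mathbf{u}, \mathbf{w}) + \alpha\epsilon\,\mathbf{t}(\mathbf{u}, \mathbf{z}) + \beta\gamma\,\mathbf{t}(\mathbf{v}, \mathbf{w}) + \beta\epsilon\,\mathbf{t}(\mathbf{v}, \mathbf{z}).
\end{aligned}$$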

4.2 Components of tensors 

When a tensor is evaluated with combinations of basis and dual basis vectors 
it yields its components in that particular basis. For example, the covariant and 
contravariant components of the rank-1 tensor (vector) in (4.1) in the basis $\mathbf{e}_a$ are 
given by 

$$\mathbf{t}(\mathbf{e}_a) = t_a \qquad\text{and}\qquad \mathbf{t}(\mathbf{e}^a) = t^a.$$

Consider now a second-rank tensor $\mathbf{t}(\cdot, \cdot)$. Its covariant and contravariant 
components are given by 

$$\mathbf{t}(\mathbf{e}_a, \mathbf{e}_b) = t_{ab} \qquad\text{and}\qquad \mathbf{t}(\mathbf{e}^a, \mathbf{e}^b) = t^{ab}.$$

For tensors of rank 2 and higher, however, we can also define sets of mixed 
components. For a rank-2 tensor there are two possible sets of mixed components, 

$$\mathbf{t}(\mathbf{e}^a, \mathbf{e}_b) = t^a{}_b \qquad\text{and}\qquad \mathbf{t}(\mathbf{e}_a, \mathbf{e}^b) = t_a{}^b.$$

For a general rank-2 tensor these two sets of components need not be equal. 
The contravariant, covariant and mixed components of higher-rank tensors can be 
obtained in an analogous manner. 

The components of a tensor in a particular basis set specify the action of the 
tensor on any other vectors in terms of their components. For example, using the 
linearity property, we find that 

$$\mathbf{t}(\mathbf{u}, \mathbf{v}) = \mathbf{t}(u^a\mathbf{e}_a, v^b\mathbf{e}_b) = t_{ab}u^a v^b.$$

To obtain this result, we expressed $\mathbf{u}$ and $\mathbf{v}$ in terms of their contravariant 
components. We could have written either vector in terms of its contravariant or covariant 
components, however. Hence we find that there are numerous equivalent 
expressions for $\mathbf{t}(\mathbf{u}, \mathbf{v})$ in component notation: 

$$\mathbf{t}(\mathbf{u}, \mathbf{v}) = t_{ab}u^a v^b = t^{ab}u_a v_b = t_a{}^b u^a v_b = t^a{}_b u_a v^b.$$


This illustrates the general rule that the subscript and superscript positions of a 
dummy index can be swapped without affecting the result. 
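
The equivalence of these component expressions is easy to confirm numerically. The sketch below is an illustration of ours, not part of the original text: it uses the plane polar metric evaluated at $\rho = 2$, an arbitrary set of covariant components $t_{ab}$, and two arbitrary vectors. It also assumes the usual raising and lowering of indices with the metric (which follows from $\mathbf{e}^a = g^{ab}\mathbf{e}_b$); all variable names are hypothetical.

```python
N = 2
g = [[1.0, 0.0], [0.0, 4.0]]        # g_ab for plane polar coordinates at rho = 2
g_inv = [[1.0, 0.0], [0.0, 0.25]]   # g^ab, the matrix inverse of g_ab

t_lo = [[1.0, 2.0], [3.0, 4.0]]     # arbitrary covariant components t_ab
u_up = [1.0, -1.0]                  # contravariant components u^a
v_up = [0.5, 2.0]                   # contravariant components v^a


def lower(vec_up):
    """Covariant components v_a = g_ab v^b."""
    return [sum(g[a][b] * vec_up[b] for b in range(N)) for a in range(N)]


u_lo, v_lo = lower(u_up), lower(v_up)

# t^{ab} = g^{ac} g^{bd} t_cd
t_up = [[sum(g_inv[a][c] * g_inv[b][d] * t_lo[c][d]
             for c in range(N) for d in range(N))
         for b in range(N)] for a in range(N)]
# t_a^b = g^{bd} t_ad   (first index down, second index up)
t_lo_up = [[sum(g_inv[b][d] * t_lo[a][d] for d in range(N))
            for b in range(N)] for a in range(N)]
# t^a_b = g^{ac} t_cb   (first index up, second index down)
t_up_lo = [[sum(g_inv[a][c] * t_lo[c][b] for c in range(N))
            for b in range(N)] for a in range(N)]


def contract(t, first, second):
    """Double sum t[a][b] * first[a] * second[b]."""
    return sum(t[a][b] * first[a] * second[b]
               for a in range(N) for b in range(N))


values = [contract(t_lo, u_up, v_up),      # t_ab u^a v^b
          contract(t_up, u_lo, v_lo),      # t^ab u_a v_b
          contract(t_lo_up, u_up, v_lo),   # t_a^b u^a v_b
          contract(t_up_lo, u_lo, v_up)]   # t^a_b u_a v^b
print(values)  # all four entries are -5.0
```

Each contraction pairs one upper with one lower index, so moving a dummy index up in one factor and down in the other leaves the scalar unchanged.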


4.3 Symmetries of tensors 

A second-rank tensor $\mathbf{t}$ is called symmetric or antisymmetric if, for all pairs of 
vectors $\mathbf{u}$ and $\mathbf{v}$, 

$$\mathbf{t}(\mathbf{u}, \mathbf{v}) = \pm\,\mathbf{t}(\mathbf{v}, \mathbf{u}),$$

with the plus sign for a symmetric tensor and the minus sign for an antisymmetric 
tensor. Setting $\mathbf{u} = \mathbf{e}_a$ and $\mathbf{v} = \mathbf{e}_b$, we see that the covariant components 
of a symmetric or antisymmetric tensor satisfy $t_{ab} = \pm t_{ba}$. By using different 
combinations of basis and dual basis vectors we also see that, for such a tensor, 
$t^{ab} = \pm t^{ba}$ and $t_a{}^b = \pm t^b{}_a$. 

An arbitrary rank-2 tensor can always be split uniquely into the sum of its 
symmetric and antisymmetric parts. For illustration let us work with the covariant 
components $t_{ab}$ of the tensor in some basis. We can always write 

$$t_{ab} = \tfrac{1}{2}(t_{ab} + t_{ba}) + \tfrac{1}{2}(t_{ab} - t_{ba}),$$

which is clearly the sum of a symmetric and an antisymmetric part. A notation 
frequently used to denote the components of the symmetric and antisymmetric 
parts is 

$$t_{(ab)} \equiv \tfrac{1}{2}(t_{ab} + t_{ba}) \qquad\text{and}\qquad t_{[ab]} \equiv \tfrac{1}{2}(t_{ab} - t_{ba}).$$


In an analogous manner, a general rank-N tensor t(u, v, ..., w) is symmetric
or antisymmetric with respect to some permutation of its vector arguments if
its value after permuting the arguments is equal to respectively plus or minus
its original value. From an arbitrary rank-N tensor, however, we can always
obtain a tensor that is symmetric with respect to all permutations of its vector
arguments and one that is antisymmetric with respect to all permutations. In terms
of the tensor's covariant components, these symmetric and antisymmetric parts are
given by


$$t_{(ab\ldots c)} = \frac{1}{N!}\,(\text{sum over all permutations of the indices } a, b, \ldots, c),$$

$$t_{[ab\ldots c]} = \frac{1}{N!}\,(\text{alternating sum over all permutations of the indices } a, b, \ldots, c).$$


For example, the covariant components of the totally antisymmetric part of a
third-rank tensor are given by


$$t_{[abc]} = \tfrac{1}{6}(t_{abc} - t_{acb} + t_{cab} - t_{cba} + t_{bca} - t_{bac}).$$
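
The alternating sum can be sketched in code as follows. This is an illustrative implementation under stated assumptions: the tensor is supplied as a function of an index tuple, and the helper names `parity` and `antisymmetrise` are hypothetical, not taken from the text.

```python
from itertools import permutations
from math import factorial


def parity(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:          # sort by swaps, flipping the sign each time
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def antisymmetrise(t, idx):
    """t_[ab...c]: alternating sum over permutations of idx, divided by N!."""
    n = len(idx)
    total = 0.0
    for perm in permutations(range(n)):
        permuted = tuple(idx[p] for p in perm)
        total += parity(perm) * t(permuted)
    return total / factorial(n)


# Assumed example: t_abc = u_a v_b w_c with u = (1,2,3), v = (1,4,9), w = (1,8,27)
def t(i):
    return (i[0] + 1) * (i[1] + 1) ** 2 * (i[2] + 1) ** 3


a012 = antisymmetrise(t, (0, 1, 2))
print(a012)
```

Swapping any two indices of the result changes its sign, and a repeated index gives zero, as expected of a totally antisymmetric object.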


We may extend the notation still further in order to define tensors that are
symmetric or antisymmetric to permutations of particular subsets of their indices.
To illustrate this, let us consider the covariant components $t_{abcd}$ of a fourth-rank
tensor. Typical expressions might include:


$$\begin{aligned}
t_{(ab)cd} &= \tfrac{1}{2}(t_{abcd} + t_{bacd}), \\
t_{a[b|c|d]} &= \tfrac{1}{2}(t_{abcd} - t_{adcb}), \\
t_{(a|b|cd)} &= \tfrac{1}{3!}(t_{abcd} + t_{abdc} + t_{dbac} + t_{dbca} + t_{cbda} + t_{cbad}), \\
t_{[ab](cd)} &= \tfrac{1}{2}\left[t_{ab(cd)} - t_{ba(cd)}\right] \\
&= \tfrac{1}{2}\left[\tfrac{1}{2}(t_{abcd} + t_{abdc}) - \tfrac{1}{2}(t_{bacd} + t_{badc})\right] \\
&= \tfrac{1}{4}(t_{abcd} + t_{abdc} - t_{bacd} - t_{badc}).
\end{aligned}$$


The symbols | | are used to exclude unwanted indices from the (anti-)
symmetrisation implied by ( ) and [ ].


4.4 The metric tensor 

The most important tensor that one can define on a manifold is the metric tensor 
g. This defines a linear map of two vectors into the number that is their inner 
product, i.e. 


$$g(u, v) = u \cdot v. \tag{4.2}$$


From this definition, it is clear that g is a symmetric second-rank tensor. Its
covariant and contravariant components are given by


$$g_{ab} = g(e_a, e_b) = e_a \cdot e_b \quad\text{and}\quad g^{ab} = g(e^a, e^b) = e^a \cdot e^b,$$


which, from (4.2), clearly match our earlier definitions. As we showed in
Chapter 3, the matrix $[g^{ab}]$ containing the contravariant components of the metric
tensor is the inverse of the matrix $[g_{ab}]$ that contains its covariant components.
The mixed components of g are given by


$$g(e^b, e_a) = g(e_a, e^b) = \delta_a^b,$$


where the last equality is a result of the reciprocity relation between basis vectors
and their duals.


4.5 Raising and lowering tensor indices

The contravariant and covariant components of the metric tensor can be used
for raising and lowering general tensor indices, just as they are used for vector
indices. As we have seen, when a tensor acts on different combinations of basis
and dual basis vectors it yields different components. Consider, for example, a
third-rank tensor t. Its covariant components are given by

$$t(e_a, e_b, e_c) = t_{abc}, \tag{4.3}$$

whereas one possible set of mixed components of the tensor is given by

$$t(e_a, e_b, e^c) = t_{ab}{}^c.$$

As we stated earlier, in general these two sets of components will differ, since
the basis and dual basis vectors are related by the metric: $e^c = g^{cd} e_d$. Thus, for
example,

$$t_{ab}{}^c = g^{cd} t_{abd}.$$

In a similar way we can raise or lower more than one index at a time. For example,

$$t^a{}_{bc} = g^{ad} g_{ce}\, t_{db}{}^e.$$
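
A short numerical sketch of index raising and lowering is given below. The two-dimensional metric and the components of t are assumed values chosen only for illustration; raising the last index with $g^{cd}$ and then lowering it again with $g_{cd}$ should return the original covariant components.

```python
# Assumed 2-D metric (illustrative values only)
g_lo = [[2.0, 1.0], [1.0, 3.0]]                       # g_ab
det = g_lo[0][0] * g_lo[1][1] - g_lo[0][1] * g_lo[1][0]
g_up = [[g_lo[1][1] / det, -g_lo[0][1] / det],
        [-g_lo[1][0] / det, g_lo[0][0] / det]]        # g^ab, the matrix inverse
R = range(2)

# Arbitrary covariant components t_abc
t_lo = [[[1.0, 2.0], [3.0, 4.0]],
        [[5.0, 6.0], [7.0, 8.0]]]

# Raise the last index: t_ab^c = g^cd t_abd
t_mixed = [[[sum(g_up[c][d] * t_lo[a][b][d] for d in R) for c in R]
            for b in R] for a in R]

# Lower it again: t_abc = g_cd t_ab^d
t_back = [[[sum(g_lo[c][d] * t_mixed[a][b][d] for d in R) for c in R]
           for b in R] for a in R]
print(t_mixed)
```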


4.6 Mapping tensors into tensors 

Tensors can be thought of not just as maps between vectors and real numbers 
but also as maps between tensors and other tensors. Consider, for example, a 
third-rank tensor t, but let us not 'fill' all of its argument 'slots' with vectors. If,
for instance, we fill just its last slot with some fixed vector u, we have the object

$$t(\,\cdot\,, \,\cdot\,, u). \tag{4.4}$$

What sort of object is this? Well, it is clear that, if we supply two further vectors to
this object, we will obtain a real number. Thus the object (4.4) is itself a second-rank
tensor, which we could denote by s (say). Thus the third-rank tensor t has
'mapped' the vector u into the second-rank tensor s. The covariant components
(say) of s are given by

$$s_{ab} = s(e_a, e_b) = t(e_a, e_b, u) = t_{abc} u^c,$$

where, in the last slot, we have expressed u as $u^c e_c$. By expressing this vector as
$u_c e^c$ instead, we obtain the equivalent expression $s_{ab} = t_{ab}{}^c u_c$.

As a further example of mapping between tensors, let us fill both the first and
last slots of t with fixed vectors v and u respectively to obtain the object


$$t(v, \,\cdot\,, u).$$

Clearly, this object is a first-rank tensor (or vector), which we denote by w. Thus
the third-rank tensor t has mapped the two vectors v and u into the vector w. The
covariant components (say) of w are

$$w_b = w(e_b) = t(v, e_b, u),$$

which can be expressed in several equivalent ways, i.e.

$$w_b = t_{abc} v^a u^c = t^a{}_{bc} v_a u^c = t_{ab}{}^c v^a u_c = t^a{}_b{}^c v_a u_c.$$

The number of free indices in such expressions is the rank of the resulting tensor.


4.7 Elementary operations with tensors 

Tensor calculus is concerned with tensorial operations, that is, operations on
tensors which result in quantities that are still tensors. We now consider some
elementary tensorial operations.


Addition (and subtraction)

It is clear from the definition of a tensor that the sum and difference of two tensors
of rank N are both themselves tensors of rank N. For example, the covariant
components (say) of the sum s and difference d of two rank-2 tensors are given
straightforwardly by


$$\begin{aligned}
s_{ab} &= s(e_a, e_b) = t(e_a, e_b) + r(e_a, e_b) = t_{ab} + r_{ab}, \\
d_{ab} &= d(e_a, e_b) = t(e_a, e_b) - r(e_a, e_b) = t_{ab} - r_{ab}.
\end{aligned} \tag{4.5}$$


Multiplication by a scalar

If t is a rank-N tensor then so too is $\alpha t$, where $\alpha$ is some arbitrary real constant.
Clearly, its components are all multiplied by $\alpha$.

Outer product 

The outer or tensor product of two tensors produces a tensor of higher rank. The 
simplest example of an outer product is that of two vectors. This is defined as the 
rank-2 tensor, denoted by $u \otimes v$, such that

$$(u \otimes v)(p, q) = u(p)\, v(q),$$

where p and q are arbitrary vector arguments (this notation is not to be confused
with the vector product $u \times v$ of two vectors, which is itself a vector). Note that
the outer product is not, in general, commutative, so that $u \otimes v$ and $v \otimes u$ are
different rank-2 tensors. The covariant components (say) of $u \otimes v$ in some basis
are given by

$$(u \otimes v)(e_a, e_b) = u(e_a)\, v(e_b) = u_a v_b.$$

The outer product of higher-rank tensors is a simple generalisation of the outer
product of two vectors. For example, the outer product of a rank-2 tensor t with
a rank-1 tensor s is defined by

$$(t \otimes s)(p, q, r) = t(p, q)\, s(r).$$

This is a rank-3 tensor, which we could call h. The mixed components, for
instance, of this tensor are given by

$$h_{ab}{}^c = t(e_a, e_b)\, s(e^c) = t_{ab} s^c. \tag{4.6}$$

In general, the outer product of an Nth-rank tensor with an Mth-rank tensor will
produce an (N + M)th-rank tensor.
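
In component form the outer product is simply a product of components with no summation. The sketch below uses assumed two-dimensional component values to show that $u \otimes v$ and $v \otimes u$ differ (they are transposes of one another) and to build the rank-3 components of (4.6).

```python
u = [1.0, 2.0]           # u_a (assumed values)
v = [3.0, -1.0]          # v_b (assumed values)
n = len(u)

uv = [[u[a] * v[b] for b in range(n)] for a in range(n)]   # (u (x) v)_ab = u_a v_b
vu = [[v[a] * u[b] for b in range(n)] for a in range(n)]   # (v (x) u)_ab = v_a u_b

t = [[1.0, 0.0], [2.0, 5.0]]   # t_ab (assumed values)
s = [4.0, -2.0]                # s^c  (assumed values)

# h_ab^c = t_ab s^c: rank 2 + rank 1 gives rank 3
h = [[[t[a][b] * s[c] for c in range(n)] for b in range(n)] for a in range(n)]
print(uv, vu)
```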

Contraction (and inner product) 

The contraction of a tensor is performed by summing over the basis and dual 
basis vectors in two of its vector arguments, and it results in a tensor of lower 
rank. Let us take as an example a rank-3 tensor h and consider the quantity

$$q(\,\cdot\,) = h(e_a, \,\cdot\,, e^a).$$

This is clearly a rank-1 tensor with covariant components (say) given by

$$q_b = h(e_a, e_b, e^a) = h_{ab}{}^a. \tag{4.7}$$

Thus in terms of tensor components, contraction amounts to setting a subscript
equal to a superscript and summing, as the summation convention requires. In
general, performing a single contraction on an Nth-rank tensor will produce a
tensor of rank N − 2.

Contraction may be combined with tensor multiplication to obtain the inner
product of two tensors. For example, if $h_{ab}{}^c$ were in fact given by (4.6), then
(4.7) could be written as

$$q_b = t(e_a, e_b)\, s(e^a) = t_{ab} s^a,$$

which is the inner product of the tensors t and s. Alternatively, one could view
the $q_b$ as a contraction of the rank-3 tensor having components $t_{ab} s^c$, which is
the outer product $t \otimes s$.

If two tensors t and s are rank 2 or lower then we can denote their inner product
unambiguously by $t \cdot s$. Note, however, that in general such an inner product is
not commutative. For example, if t is a rank-2 tensor and s is rank 1 then the
contravariant components (say) of the vectors $t \cdot s$ and $s \cdot t$ are respectively

$$t^{ab} s_b \quad\text{and}\quad t^{ab} s_a.$$

Clearly, the ‘dot’ notation for the inner product becomes ambiguous if either 
tensor is rank 3 or higher, since there is then a choice concerning which indices 
to contract. 
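
The following sketch, with assumed two-dimensional component values, illustrates both points: contracting the outer product $t_{ab} s^c$ over its first and last indices reproduces the inner product $t_{ab} s^a$, and the two orderings $t \cdot s$ and $s \cdot t$ give different vectors when $t^{ab}$ is not symmetric.

```python
n = 2
R = range(n)

t_lo = [[1.0, 0.0], [2.0, 5.0]]    # t_ab (assumed values)
s_up = [4.0, -2.0]                 # s^c  (assumed values)

# Outer product h_ab^c = t_ab s^c, then contract first and last slots: q_b = h_ab^a
h = [[[t_lo[a][b] * s_up[c] for c in R] for b in R] for a in R]
q = [sum(h[a][b][a] for a in R) for b in R]

# The same result written directly as an inner product: q_b = t_ab s^a
q_direct = [sum(t_lo[a][b] * s_up[a] for a in R) for b in R]

# Non-commutativity of the 'dot' product for a non-symmetric t^ab
t_up = [[1.0, 2.0], [3.0, 4.0]]    # t^ab (assumed values)
s_lo = [1.0, -1.0]                 # s_a  (assumed values)
t_dot_s = [sum(t_up[a][b] * s_lo[b] for b in R) for a in R]   # t^ab s_b
s_dot_t = [sum(t_up[a][b] * s_lo[a] for a in R) for b in R]   # t^ab s_a
print(q, t_dot_s, s_dot_t)
```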


4.8 Tensors as geometrical objects 

We have seen that a rank-1 tensor $t(\,\cdot\,)$ can be identified as a vector. The covariant
and contravariant components of this vector in some basis are given by

$$t(e_a) = t_a \quad\text{and}\quad t(e^a) = t^a.$$

We are used to thinking of a vector t as a geometrical object which can be made
up from a linear combination of the basis vectors,

$$t = t^a e_a = t_a e^a. \tag{4.8}$$

Tensors of higher rank are generalisations of the concept of a vector and can
also be regarded as geometrical entities. In a particular basis, a general tensor
can be expressed as a linear combination of a tensor basis made up from the basis
vectors and their duals.

Let us consider the outer product $e_a \otimes e_b$ of two basis vectors of some coordinate
system. The contravariant components of this rank-2 tensor in this basis are very
simple,

$$(e_a \otimes e_b)(e^c, e^d) = e_a(e^c)\, e_b(e^d) = \delta_a^c \delta_b^d.$$

Now suppose that we have some general rank-2 tensor t, whose contravariant
components in our basis are $t^{ab}$. Let us consider the quantity $t^{ab}(e_a \otimes e_b)$. This is
a sum of rank-2 tensors, which must therefore also be a rank-2 tensor (see above).
If we consider its action on two basis vectors, we find

$$t^{ab}(e_a \otimes e_b)(e^c, e^d) = t^{ab} \delta_a^c \delta_b^d = t^{cd},$$

where the $t^{cd}$ are simply the contravariant components of t. Thus, in an analogous way
to the vector in (4.8), we may express the rank-2 tensor t as a linear combination
of basis tensors,


$$t = t^{ab}(e_a \otimes e_b).$$

By considering different tensor bases, constructed from other combinations of the
basis and dual basis vectors, we can also write t in several different ways:

$$t = t_{ab}(e^a \otimes e^b) = t_a{}^b (e^a \otimes e_b) = t^a{}_b (e_a \otimes e^b).$$

This idea is extended straightforwardly to higher-rank tensors. 


4.9 Tensors and coordinate transformations 


The description of tensors as geometrical objects lends itself naturally to a
discussion of the behaviour of tensor components under a coordinate transformation
$x^a \to x'^a$ on the manifold. As shown in Chapter 3, there is a simple
relationship between the coordinate basis vectors $e_a$ associated with the coordinate
system $x^a$ and the coordinate basis vectors $e'_a$ associated with a new system
of coordinates $x'^a$. We found that at any point P the two sets of coordinate basis
vectors are related by


$$e'_a = \frac{\partial x^b}{\partial x'^a}\, e_b, \tag{4.9}$$


where the partial derivative is evaluated at the point P. A similar relationship
holds between the two sets of dual basis vectors:


$$e'^a = \frac{\partial x'^a}{\partial x^b}\, e^b. \tag{4.10}$$


Using (4.9) and (4.10), we can now calculate how the components of any general
tensor must transform under the coordinate transformation.

As shown in Chapter 3, the contravariant components of a vector t in the new
coordinate basis are given by


$$t'^a = t(e'^a) = \frac{\partial x'^a}{\partial x^b}\, t(e^b) = \frac{\partial x'^a}{\partial x^b}\, t^b.$$


Similarly, the covariant components of t are given by


$$t'_a = t(e'_a) = \frac{\partial x^b}{\partial x'^a}\, t(e_b) = \frac{\partial x^b}{\partial x'^a}\, t_b.$$


It is important to remember that the unprimed and primed components describe
the same vector t in terms of different basis vectors, i.e. $t = t^a e_a = t'^a e'_a$. The
vector t is a geometric entity that does not depend on the choice of coordinate
system.

The transformation properties of the components of higher-rank tensors may
be found in a similar way. For example, if t is a second-rank tensor then


$$\begin{aligned}
t'^{ab} &= \frac{\partial x'^a}{\partial x^c}\frac{\partial x'^b}{\partial x^d}\, t^{cd}, \\
t'^a{}_b &= \frac{\partial x'^a}{\partial x^c}\frac{\partial x^d}{\partial x'^b}\, t^c{}_d, \\
t'_{ab} &= \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b}\, t_{cd}.
\end{aligned} \tag{4.11}$$


Once again, these components describe the same tensor (which is a geometric
entity) in terms of different bases. For example,


$$t = t^{ab}(e_a \otimes e_b) = t'^{ab}(e'_a \otimes e'_b).$$


In general, when transforming the components of a tensor of arbitrary rank,
each superscript inherits a transformation 'matrix' $\partial x'^a/\partial x^c$ and each subscript a
transformation matrix $\partial x^c/\partial x'^a$. Thus, for example,


$$t'_{ab}{}^c = \frac{\partial x^d}{\partial x'^a}\frac{\partial x^e}{\partial x'^b}\frac{\partial x'^c}{\partial x^f}\, t_{de}{}^f. \tag{4.12}$$


Indeed, the basic requirement for a set of quantities to be the components of a 
tensor is that they transform in such a way under a change of coordinates. We 
shall return to this point later. 
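
A small numerical sketch may help to make the transformation rule concrete. It assumes a linear change of coordinates $x'^a = J^a{}_b x^b$ in two dimensions, so that the matrices $\partial x'^a/\partial x^b$ and $\partial x^b/\partial x'^a$ are constant; the particular matrix and component values are illustrative only. The contravariant components of a rank-2 tensor and the covariant components of two vectors are transformed, and the scalar $t^{ab} u_a v_b$ is found to be unchanged.

```python
R = range(2)

# Assumed constant transformation matrices for x'^a = J^a_b x^b
J = [[2, 1], [1, 1]]          # J[a][b]    = dx'^a / dx^b
J_inv = [[1, -1], [-1, 2]]    # J_inv[b][a] = dx^b / dx'^a (matrix inverse of J)

t_up = [[1, 2], [3, 4]]       # t^ab (assumed values)
u_lo = [1, -1]                # u_a
v_lo = [2, 5]                 # v_b

# Each superscript picks up dx'^a/dx^c; each subscript picks up dx^c/dx'^a
t_up_new = [[sum(J[a][c] * J[b][d] * t_up[c][d] for c in R for d in R)
             for b in R] for a in R]
u_lo_new = [sum(J_inv[b][a] * u_lo[b] for b in R) for a in R]
v_lo_new = [sum(J_inv[b][a] * v_lo[b] for b in R) for a in R]

scalar_old = sum(t_up[a][b] * u_lo[a] * v_lo[b] for a in R for b in R)
scalar_new = sum(t_up_new[a][b] * u_lo_new[a] * v_lo_new[b] for a in R for b in R)
print(scalar_old, scalar_new)
```

The components change, but the fully contracted quantity does not, which is the sense in which the components describe a single geometric object.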


4.10 Tensor equations 

Given a coordinate system (and hence a coordinate basis and its dual), it is 
convenient to work in terms of the components of a tensor t in this system rather 
than with the geometrical entity t itself. Therefore, from here onwards we shall 
adopt a much-used convention, which is to confuse a tensor with its components. 
This allows us to refer simply to the tensor $t_{ab}{}^c$, rather than the tensor with
components $t_{ab}{}^c$.

We now come to the reason why tensors are important in mathematical physics.
Let us illustrate this by way of an example. Suppose we find that in one particular
coordinate system two tensors are equal, for example,


$$t_{ab} = s_{ab}. \tag{4.13}$$

Let us multiply both sides by $\partial x^a/\partial x'^c$ and $\partial x^b/\partial x'^d$ and take the implied
summations to obtain

$$\frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}\, t_{ab} = \frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}\, s_{ab}.$$

Since $t_{ab}$ and $s_{ab}$ are both covariant components of tensors of rank 2, this equation
can be restated as $t'_{cd} = s'_{cd}$. In other words, the equation (4.13) holds in any
other coordinate system. In short, a tensor equation which holds in one coordinate
system necessarily holds in all coordinate systems. Put another way, tensor
equations are coordinate independent, which is in fact obvious from the
ric approach we have adopted since the outset. One particularly useful fact that 
emerges clearly from this discussion, and the transformation law (4.12), is that if 
all the components of a tensor are zero in one coordinate system then they vanish 
in all coordinate systems. This is useful in proving many tensor relations. 


4.11 The quotient theorem 

Not all objects with indices are the components of a tensor. An important example
is provided by the connection coefficients $\Gamma^a{}_{bc}$, which vanish in a locally Cartesian
coordinate system but not in other coordinate systems. Moreover, in Chapter 3
we derived the transformation properties of $\Gamma^a{}_{bc}$ and found that these were not of
the form (4.12).

As mentioned above, the fundamental requirement that a set of quantities form 
the components of a tensor is that they obey a transformation law of the kind 
(4.12) under a change of coordinates. The quotient theorem provides a means of 
establishing this requirement in a particular case without having to demonstrate 
explicitly that the transformation law holds. It states that if a set of quantities 
when contracted with a tensor produces another tensor then the original set of 
quantities is also a tensor. Rather than give a general statement of the theorem 
and its proof, which tend to become obscured by a mass of indices, we shall give 
an example that illustrates the gist of the theorem. 

In an N-dimensional manifold, suppose that with each system of coordinates
about a point P there are associated $N^3$ numbers $t^a{}_{bc}$ and it is known that, for
arbitrary contravariant vector components $v^a$, the $N^2$ numbers $t^a{}_{bc} v^c$ transform as the
components of a rank-2 tensor at P under a change of coordinates. This means that


$$t'^a{}_{bc}\, v'^c = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\, t^d{}_{ef}\, v^f, \tag{4.14}$$


where the $t'^a{}_{bc}$ are the corresponding $N^3$ numbers associated with the primed
coordinate system. Then we may deduce that the $t^a{}_{bc}$ are the components of a
rank-3 tensor, as follows. Since $v^f = (\partial x^f/\partial x'^c)\, v'^c$, equation (4.14) yields


$$t'^a{}_{bc}\, v'^c = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\, t^d{}_{ef}\, v'^c,$$


which, on rearrangement, gives


$$\left(t'^a{}_{bc} - \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\, t^d{}_{ef}\right) v'^c = 0.$$


This holds for arbitrary vector components $v'^c$, so the expression in parentheses
must vanish identically. Thus

$$t'^a{}_{bc} = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\, t^d{}_{ef},$$

and therefore the $t^a{}_{bc}$ must be the components of a third-rank tensor.

Thus the gist of the quotient theorem is that if a set of numbers displays tensor
characteristics when some of their indices are 'killed off' by summation with the
components of an arbitrary tensor then the original numbers are the components
of a tensor.


4.12 Covariant derivative of a tensor

It is straightforward to show that in an arbitrary coordinate system (unlike in local
Cartesian coordinates) the differentiation of the components of a general tensor,
other than a scalar, with respect to the coordinates does not in general result in
the components of another tensor. For example, consider the derivative of the
contravariant components $v^a$ of a vector. Under a change of coordinates we have

$$\begin{aligned}
\frac{\partial v'^a}{\partial x'^b} &= \frac{\partial x^c}{\partial x'^b}\frac{\partial v'^a}{\partial x^c} \\
&= \frac{\partial x^c}{\partial x'^b}\frac{\partial}{\partial x^c}\left(\frac{\partial x'^a}{\partial x^d}\, v^d\right) \\
&= \frac{\partial x^c}{\partial x'^b}\frac{\partial x'^a}{\partial x^d}\frac{\partial v^d}{\partial x^c} + \frac{\partial x^c}{\partial x'^b}\frac{\partial^2 x'^a}{\partial x^c \partial x^d}\, v^d.
\end{aligned} \tag{4.15}$$

The presence of the second term on the right-hand side of (4.15) shows that the
derivatives $\partial v^a/\partial x^b$ do not form the components of a second-order tensor. This
term arises because the 'transformation matrix' $[\partial x'^a/\partial x^b]$ changes with position
in the manifold (this is not true in local Cartesian coordinates, for which the
second term vanishes).

To avoid this difficulty, in Chapter 3 we introduced the covariant derivative of 
a vector, 


$$\nabla_b v^a = \partial_b v^a + \Gamma^a{}_{cb}\, v^c,$$

in terms of which we may write $\partial_b v = (\nabla_b v^a)\, e_a$. Using the transformation
properties of the connection, derived in Chapter 3, it is straightforward to show that
the $\nabla_b v^a$ are the (mixed) components of a rank-2 tensor, which is in fact clear
from their definition. We denote this rank-2 tensor by $\nabla v$, which is formally the
outer product of the vector differential operator $\nabla$ with the vector v, although it
is usual to omit the symbol $\otimes$ in outer products containing $\nabla$. In a given basis we
have $\nabla = e^a \partial_a$, so we may write, for example,


$$\nabla v = e^a \partial_a \otimes v^b e_b = e^a \otimes \partial_a (v^b e_b) = (\nabla_a v^b)\, e^a \otimes e_b.$$


Similarly, the $\nabla_b v_a$ form the covariant components of this tensor, i.e.
$\nabla v = (\nabla_a v_b)\, e^a \otimes e^b$. Indeed, it is easy to check that $\nabla_b v^a$ and $\nabla_b v_a$ satisfy the required
transformation laws for being the components of a tensor.

We can extend the idea of the covariant derivative to higher-rank tensors. For 
example, let us consider an arbitrary rank-2 tensor t and derive the form of the 
covariant derivative $\nabla_c t^{ab}$ of its contravariant components. Expressing t in terms
of its contravariant components, we have

$$\partial_c t = \partial_c (t^{ab}\, e_a \otimes e_b) = (\partial_c t^{ab})\, e_a \otimes e_b + t^{ab} (\partial_c e_a) \otimes e_b + t^{ab}\, e_a \otimes (\partial_c e_b).$$

We can rewrite the derivatives of the basis vectors in terms of connection
coefficients to obtain

$$\partial_c t = (\partial_c t^{ab})\, e_a \otimes e_b + t^{ab}\, \Gamma^d{}_{ac}\, e_d \otimes e_b + t^{ab}\, e_a \otimes \Gamma^d{}_{bc}\, e_d.$$

Interchanging the dummy indices a and d in the second term on the right-hand
side and b and d in the third term, this becomes

$$\partial_c t = \left(\partial_c t^{ab} + \Gamma^a{}_{dc}\, t^{db} + \Gamma^b{}_{dc}\, t^{ad}\right) e_a \otimes e_b,$$

where the expression in parentheses is the required covariant derivative,

$$\nabla_c t^{ab} = \partial_c t^{ab} + \Gamma^a{}_{dc}\, t^{db} + \Gamma^b{}_{dc}\, t^{ad}. \tag{4.16}$$

Using (4.16), the derivative of the tensor t with respect to x c can be written in 
terms of its contravariant components as 


$$\partial_c t = (\nabla_c t^{ab})\, e_a \otimes e_b.$$

Similar results may be obtained for the covariant derivatives of the mixed
and covariant components of the second-order tensor t. Collecting these results
together, we have


$$\begin{aligned}
\nabla_c t^{ab} &= \partial_c t^{ab} + \Gamma^a{}_{dc}\, t^{db} + \Gamma^b{}_{dc}\, t^{ad}, \\
\nabla_c t^a{}_b &= \partial_c t^a{}_b + \Gamma^a{}_{dc}\, t^d{}_b - \Gamma^d{}_{bc}\, t^a{}_d, \\
\nabla_c t_{ab} &= \partial_c t_{ab} - \Gamma^d{}_{ac}\, t_{db} - \Gamma^d{}_{bc}\, t_{ad}.
\end{aligned} \tag{4.17}$$


The positions of the indices in these expressions are once again very systematic.
The last index on each connection coefficient matches that on the covariant
derivative, and the remaining indices can only be logically arranged in one way.
For each contravariant index (superscript) on the left-hand side we add a term on
the right-hand side containing a Christoffel symbol with a plus sign, and for every
covariant index (subscript) we add a corresponding term with a minus sign. This
is extended straightforwardly to tensors with an arbitrary number of contravariant
and covariant indices. We note that the quantities $\nabla_c t^{ab}$, $\nabla_c t^a{}_b$ and $\nabla_c t_{ab}$ are the
components of the same third-order tensor $\nabla t$ with respect to different tensor
bases, i.e.

$$\nabla t = (\nabla_c t^{ab})\, e^c \otimes e_a \otimes e_b = (\nabla_c t^a{}_b)\, e^c \otimes e_a \otimes e^b = (\nabla_c t_{ab})\, e^c \otimes e^a \otimes e^b.$$

One particularly important result is that the covariant derivative of the metric 
tensor g is identically zero at all points in a manifold, i.e.

$$\nabla g = 0.$$

Alternatively, we can write this in terms of the components in any basis as

$$\nabla_c g_{ab} = 0 \quad\text{and}\quad \nabla_c g^{ab} = 0. \tag{4.18}$$

This result follows immediately from comparing, for example, the third result in 
(4.17) with our expression (3.20), derived in Chapter 3, for the partial derivative 
of the metric in terms of the affine connection. We note, in particular, that the 
expression (3.20) holds even in a manifold with non-zero torsion, and therefore 
so too must the result (4.18). 1 

1 In fact, for a general manifold with non-zero torsion, it is not necessary that (4.18) holds since one can,
in principle, define the affine connection and the metric independently. In arriving at our earlier expression
(3.20), we had in fact already assumed implicitly that the affine connection was metric-compatible, in which
case (4.18) holds automatically. This topic is, however, beyond the scope of our discussion.

The result (4.18) has an important consequence, which considerably simplifies
tensor manipulations. This is that we can interchange the order of raising or
lowering an index and performing covariant differentiation without affecting the
result. For example, consider the contravariant components $t^{ab}$ of some rank-2
tensor. Using (4.18), we can write, for example,

$$\nabla_c t^{ab} = \nabla_c (g^{bd}\, t^a{}_d) = (\nabla_c g^{bd})\, t^a{}_d + g^{bd} (\nabla_c t^a{}_d) = g^{bd} (\nabla_c t^a{}_d).$$

We also note that the covariant derivative obeys the standard rule for the
differentiation of a product.
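
The vanishing of the covariant derivative of the metric can be verified explicitly in a simple case. The sketch below assumes plane polar coordinates $(r, \theta)$ with line element $ds^2 = dr^2 + r^2 d\theta^2$, for which the non-zero connection coefficients are $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r$; it evaluates the covariant-component rule for $\nabla_c g_{ab}$ (one minus-sign connection term per subscript) at a sample radius.

```python
r = 2.0                                   # sample radius (arbitrary, non-zero)
R = range(2)                              # index 0 = r, index 1 = theta

g = [[1.0, 0.0], [0.0, r ** 2]]           # g_ab for ds^2 = dr^2 + r^2 dtheta^2

# dg[c][a][b] = d_c g_ab; the only non-zero entry is d_r g_(theta,theta) = 2r
dg = [[[0.0, 0.0], [0.0, 2.0 * r]],
      [[0.0, 0.0], [0.0, 0.0]]]

# Gamma[a][b][c] = Gamma^a_bc
Gamma = [[[0.0, 0.0], [0.0, -r]],                 # Gamma^r_(theta,theta) = -r
         [[0.0, 1.0 / r], [1.0 / r, 0.0]]]        # Gamma^theta_(r,theta) = 1/r


def nabla_g(c, a, b):
    """nabla_c g_ab = d_c g_ab - Gamma^d_ac g_db - Gamma^d_bc g_ad."""
    return (dg[c][a][b]
            - sum(Gamma[d][a][c] * g[d][b] for d in R)
            - sum(Gamma[d][b][c] * g[a][d] for d in R))


residuals = [nabla_g(c, a, b) for c in R for a in R for b in R]
print(residuals)
```

Every entry of `residuals` vanishes, even though the partial derivative $\partial_r g_{\theta\theta} = 2r$ does not.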


4.13 Intrinsic derivative of a tensor along a curve 

In Chapter 3 we encountered vector fields that are defined only on some subspace 
of the manifold, an extreme example being when the vector field v(u) is defined 
only along some curve $x^a(u)$ in the manifold (as for the spin $s(\tau)$ of a single
particle along its worldline in spacetime). In a similar way a tensor field t(u)
could be defined only along some curve C. We now consider how to calculate
the derivative of such a tensor with respect to the parameter u along the curve.

Let us begin by expressing the tensor at any point along the curve in terms of
its contravariant components (say),

$$t(u) = t^{ab}(u)\, e_a(u) \otimes e_b(u),$$


where the $e_a(u)$ are the coordinate basis vectors at the point on the curve
corresponding to the parameter value u. Thus, the derivative of t along the curve C is
given by


$$\frac{dt}{du} = \frac{dt^{ab}}{du}\, e_a \otimes e_b + t^{ab}\, \frac{de_a}{du} \otimes e_b + t^{ab}\, e_a \otimes \frac{de_b}{du}.$$

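Using the chain rule to rewrite the derivatives of the basis vectors, we obtain


$$\frac{dt}{du} = \frac{dt^{ab}}{du}\, e_a \otimes e_b + t^{ab} \left(\frac{dx^c}{du}\frac{\partial e_a}{\partial x^c}\right) \otimes e_b + t^{ab}\, e_a \otimes \left(\frac{dx^c}{du}\frac{\partial e_b}{\partial x^c}\right).$$


Finally, by writing the partial derivatives of the basis vectors in terms of the
connection and relabelling indices, we find that


$$\frac{dt}{du} = \left(\frac{dt^{ab}}{du} + \Gamma^a{}_{dc}\, t^{db}\, \frac{dx^c}{du} + \Gamma^b{}_{dc}\, t^{ad}\, \frac{dx^c}{du}\right) e_a \otimes e_b. \tag{4.19}$$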


The term in brackets is called the intrinsic (or absolute) derivative of the
components $t^{ab}$ along the curve C and is denoted


$$\frac{Dt^{ab}}{Du} \equiv \frac{dt^{ab}}{du} + \Gamma^a{}_{dc}\, t^{db}\, \frac{dx^c}{du} + \Gamma^b{}_{dc}\, t^{ad}\, \frac{dx^c}{du}.$$

Similar results may be obtained for the covariant and mixed components of the
tensor t. For example, the derivative of t along the curve may be written


$$\frac{dt}{du} = \frac{Dt^{ab}}{Du}\, e_a \otimes e_b = \frac{Dt_{ab}}{Du}\, e^a \otimes e^b = \frac{Dt_a{}^b}{Du}\, e^a \otimes e_b.$$


Clearly, the method can be extended easily to higher-rank tensors.

In a similar way to vectors, a tensor t is said to be parallel-transported along a
curve C if $dt/du = 0$ or, equivalently, in terms of its components, if for example
$Dt^{ab}/Du = 0$.

Following our discussion of the intrinsic derivative of a vector in Chapter 3, a 
convenient way to remember the form of the intrinsic derivative is to pretend that 
the tensor t is in fact defined throughout (some region of) the manifold, i.e. not 
only along the curve C. If this were the case then we could differentiate t with 
respect to the coordinates x a . Thus we could write 


$$\frac{dt^{ab}}{du} = \frac{\partial t^{ab}}{\partial x^c}\frac{dx^c}{du}.$$


Substituting this into (4.19), we could then factor out $dx^c/du$ and recognise the
other factor as the covariant derivative $\nabla_c t^{ab}$. Thus we could write


$$\frac{Dt^{ab}}{Du} = \nabla_c t^{ab}\, \frac{dx^c}{du}, \tag{4.20}$$


with similar expressions for the intrinsic derivatives of its other components. It
must be remembered, however, that if t is only defined along the curve C then
formally (4.20) is not defined and acts merely as an aide-memoire.
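
As a worked illustration of the aide-memoire, consider the mixed components $t^a{}_b$. Assuming the sign rule described for the covariant derivative (a plus-sign connection term for each superscript and a minus-sign term for each subscript), contracting the covariant derivative with $dx^c/du$ in the manner of (4.20) suggests the following form for the intrinsic derivative; this is a sketch obtained by analogy, not a separate derivation.

```latex
\frac{D t^a{}_b}{Du}
  = \nabla_c t^a{}_b \,\frac{dx^c}{du}
  = \frac{d t^a{}_b}{du}
    + \Gamma^a{}_{dc}\, t^d{}_b \,\frac{dx^c}{du}
    - \Gamma^d{}_{bc}\, t^a{}_d \,\frac{dx^c}{du}.
```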


Exercises 

4.1 If t is a rank-2 tensor, show that 

$$t(u + v, w + z) = t_{ab}(u^a + v^a)(w^b + z^b).$$

4.2 If $s_{ab} = s_{ba}$ and $t_{ab} = -t_{ba}$ are the components of a symmetric and an antisymmetric
tensor respectively, show that $s_{ab} t^{ab} = 0$.

4.3 If $t_{ab}$ are the components of an antisymmetric tensor and $v_a$ the components of a
vector, show that

$$v_{[a} t_{bc]} = \tfrac{1}{3}(v_a t_{bc} + v_c t_{ab} + v_b t_{ca}).$$

4.4 If $t_{ab}$ are the components of a symmetric tensor and $v_a$ the components of a vector,
show that if

$$v_a t_{bc} + v_c t_{ab} + v_b t_{ca} = 0$$

then either $t_{ab} = 0$ or $v_a = 0$.

4.5 If the tensor $t_{abcd}$ satisfies $t_{abcd} v^a w^b v^c w^d = 0$ for arbitrary vectors $v^a$ and $w^a$, show
that

$$t_{abcd} + t_{cdab} + t_{cbad} + t_{adcb} = 0.$$

4.6 Consider the infinitesimal coordinate transformation

$$x'^a = x^a + \epsilon v^a(x),$$

where $v^a(x)$ is a vector field and $\epsilon$ is a small scalar quantity. Show that, to first
order in $\epsilon$,

$$g'_{ab}(x') = g_{ab}(x) - \epsilon\,(g_{ac}\, \partial_b v^c + g_{cb}\, \partial_a v^c).$$

4.7 By investigating their transformation properties, show that $\nabla_b v^a$ are the mixed
components of a rank-2 tensor.

4.8 If $v_a$ are the covariant components of a vector and $A_{ab}$ are the components of an
antisymmetric rank-2 tensor, show that

$$\nabla_a v_b - \nabla_b v_a = \partial_a v_b - \partial_b v_a,$$

$$\nabla_a A_{bc} + \nabla_c A_{ab} + \nabla_b A_{ca} = \partial_a A_{bc} + \partial_c A_{ab} + \partial_b A_{ca}.$$

Determine the symmetry properties of the rank-3 tensor

$$B_{abc} = \partial_a A_{bc} + \partial_c A_{ab} + \partial_b A_{ca}.$$

4.9 Show that covariant differentiation obeys the usual product rule, e.g.

$$\nabla_a (A_{bc} B^{cd}) = (\nabla_a A_{bc}) B^{cd} + A_{bc} (\nabla_a B^{cd}).$$

Hint: Use local Cartesian coordinates.

4.10 For a general rank-2 tensor $T^{ab}$, show that the covariant divergence is given by

$$\nabla_a T^{ab} = \frac{1}{\sqrt{|g|}}\, \partial_a \left(\sqrt{|g|}\, T^{ab}\right) + \Gamma^b{}_{ca}\, T^{ac}.$$

Show further that if $A^{ab} = -A^{ba}$ are the components of an antisymmetric rank-2
tensor then

$$\nabla_a A^{ab} = \frac{1}{\sqrt{|g|}}\, \partial_a \left(\sqrt{|g|}\, A^{ab}\right).$$

Hence show that if the antisymmetric tensor field $A^{ab}$ vanishes on a hypersurface
S that bounds a region V of an N-dimensional manifold then

$$\int_V (\nabla_a A^{ab}) \sqrt{|g|}\; d^N x = 0.$$

4.11 Any coordinate transformation $x^a \to x'^a$ under which the metric is form invariant,
i.e. such that


$$g'_{ab}(x) = g_{ab}(x)$$

is called an isometry (note that the argument is the same on both sides of the above
equation). Show that the infinitesimal coordinate transformation in Exercise 4.6 is
an isometry, to first order in $\epsilon$, provided that $v^a$ satisfies

$$g_{ac}\, \partial_b v^c + g_{cb}\, \partial_a v^c + v^c\, \partial_c g_{ab} = 0.$$

Show further that this expression can be written as

$$\nabla_a v_b + \nabla_b v_a = 0.$$

This is Killing's equation and any vector satisfying it is known as a Killing vector
of the metric $g_{ab}$. Show that if $v^a$ and $w^a$ are both Killing vectors then so too is any
linear combination $\lambda v^a + \mu w^a$, where $\lambda$ and $\mu$ are constants.



5 

Special relativity revisited 


Now that we have the machinery of tensor calculus in place, let us return to special 
relativity and consider how to express this theory in a more formal manner. 


5.1 Minkowski spacetime in Cartesian coordinates 

In the language of Chapter 2, the Minkowski spacetime of special relativity 
is a fixed four-dimensional pseudo-Euclidean manifold. As such, there exists a 
privileged class of Cartesian coordinate systems (t, x, y, z) covering the whole 
manifold, so that at every point (or event) the squared line element takes the form 

$$ds^2 = c^2 d\tau^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2,$$

where we have taken the opportunity to define the proper time interval
$d\tau^2 = ds^2/c^2$. It is convenient to introduce the indexed coordinates $x^\mu$ ($\mu = 0, 1, 2, 3$),$^1$
so that

$$x^0 = ct, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z,$$

and to write the line element as


$$ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu,$$


where the $\eta_{\mu\nu}$ are the covariant components of the metric tensor and are
given by


$$[\eta_{\mu\nu}] = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{5.1}$$


From now on we will often use the shorthand notation $[\eta_{\mu\nu}] = \mathrm{diag}(1, -1, -1, -1)$.
It is clear that the contravariant components of the metric are identical, i.e.
$[\eta^{\mu\nu}] = \mathrm{diag}(1, -1, -1, -1)$. With this definition of the metric, Minkowski
spacetime has a signature of $-2$.$^2$ We also note that, since the metric coefficients
are constant, the connection $\Gamma^\mu{}_{\nu\sigma}$ vanishes everywhere in this coordinate system.

1 It is conventional to use Greek indices when discussing four-dimensional spacetimes rather than the Latin
indices a, b, c etc. from the start of the alphabet, which are used for abstract N-dimensional manifolds.
Moreover, in relativity theory, it is more common for a Greek index to run from 0 to 3 than from 1 to 4
(although the latter usage is found in some textbooks). Also, it is conventional to use Latin letters from the
middle of the alphabet, such as i, j, k etc., for indices that run from 1 to 3.


5.2 Lorentz transformations 


Cartesian coordinates, which we are using in the context of special relativity, have
a direct physical interpretation and correspond to distances and times measured
by an observer at rest in some inertial frame S that is labelled using
three-dimensional Cartesian coordinates$^3$ (remember that, in Chapter 1, we defined
an inertial frame as one in which a free particle moves in a straight line with
fixed speed). Transforming to a different Cartesian inertial frame corresponds to
performing a coordinate transformation on the Minkowski spacetime to a new
system $x'^\mu$. Since we require that the new coordinate system $x'^\mu$ also corresponds
to a Cartesian inertial frame, the (squared) line element $ds^2$ must take the same
form in these primed coordinates as it did in the unprimed coordinates, i.e.

$$ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu = \eta_{\mu\nu}\, dx'^\mu dx'^\nu.$$


In other words the metric in the new coordinates must also be given by (5.1).
From the transformation properties of a second-rank tensor, this means that the
transformation $x^\mu \to x'^\mu$ must satisfy


$$\eta_{\mu\nu} = \frac{\partial x'^\rho}{\partial x^\mu}\frac{\partial x'^\sigma}{\partial x^\nu}\, \eta_{\rho\sigma}, \tag{5.2}$$


which is the necessary and sufficient condition that a transformation $x^\mu \to x'^\mu$
is a Lorentz transformation between two Cartesian inertial coordinate systems.
From (5.2), we see that the elements of the transformation matrix must be
constants. Thus the transformation between two inertial coordinate systems must
be linear, i.e.


$$x'^\mu = \Lambda^\mu{}_\nu\, x^\nu + a^\mu, \tag{5.3}$$


where the $\Lambda^\mu{}_\nu$ and $a^\mu$ are constants. This has the form of a general inhomogeneous
Lorentz transformation (or Poincaré transformation). We will generally take the
(unimportant) constants $a^\mu$ to be zero, in which case (5.3) reduces to a normal,
homogeneous, Lorentz transformation. As discussed in Chapter 1, the constants
$\Lambda^\mu{}_\nu$ in the transformation matrix depend upon the relative speed and orientation
of the two inertial frames. If the unprimed and primed coordinates correspond
to inertial frames S and S' in standard configuration, with S' moving at a speed
v relative to S, then the transformation matrix can be written in two equivalent
forms:


$$[\Lambda^\mu{}_\nu] = \left[\frac{\partial x'^\mu}{\partial x^\nu}\right]
= \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} \cosh\psi & -\sinh\psi & 0 & 0 \\ -\sinh\psi & \cosh\psi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{5.4}$$


where $\beta = v/c$, $\gamma = (1 - \beta^2)^{-1/2}$ and the rapidity is defined by $\psi = \tanh^{-1}\beta$.
Clearly, if the axes of S' and S are rotated with respect to one another then the
transformation is more complicated.

2 Note that some relativists use an alternative, but equivalent, definition $[\eta_{\mu\nu}] = \mathrm{diag}(-1, 1, 1, 1)$ in which the
signature is $+2$.

3 We shall prove this shortly.

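The transformation inverse to (5.4) is clearly obtained by putting $v \to -v$ (or
equivalently $\psi \to -\psi$). In general, the inverse transformation matrix is denoted by

$$[\Lambda_\mu{}^\nu] = \left[\frac{\partial x^\nu}{\partial x'^\mu}\right]$$

and may be calculated from the forward transform using the index-raising and
index-lowering properties of the metric, i.e.

$$\Lambda_\mu{}^\nu = \eta_{\mu\rho}\, \eta^{\nu\sigma}\, \Lambda^\rho{}_\sigma.$$

That this is indeed the required inverse may be shown using the condition (5.2),
which gives

$$\Lambda_\mu{}^\sigma \Lambda^\mu{}_\nu = \eta_{\mu\rho}\, \eta^{\sigma\tau}\, \Lambda^\rho{}_\tau \Lambda^\mu{}_\nu = \eta_{\nu\tau}\, \eta^{\sigma\tau} = \delta^\sigma_\nu.$$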
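
The properties of the boost matrix discussed in this section can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`boost`, `boost_from_rapidity`, `inverse_via_metric`, `close`) are illustrative choices and are not taken from the text. It tests the condition (5.2) in the matrix form $\Lambda^{T}\eta\Lambda = \eta$, the inverse built from the metric, and the agreement of the two forms of (5.4).

```python
import math

# Minkowski metric diag(1, -1, -1, -1); numerically it equals its own inverse.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def boost(beta):
    """Standard-configuration boost matrix of (5.4), with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -beta * gamma, 0, 0],
            [-beta * gamma, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def boost_from_rapidity(psi):
    """The same matrix written in terms of the rapidity psi = atanh(beta)."""
    ch, sh = math.cosh(psi), math.sinh(psi)
    return [[ch, -sh, 0, 0], [-sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def inverse_via_metric(lam):
    """Inverse built by raising and lowering indices: the matrix eta * lam^T * eta."""
    return matmul(ETA, matmul(transpose(lam), ETA))


def close(a, b, tol=1e-12):
    """Entry-by-entry comparison of two 4x4 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


lam = boost(0.6)
print(close(matmul(transpose(lam), matmul(ETA, lam)), ETA))   # condition (5.2)
print(close(inverse_via_metric(lam), boost(-0.6)))            # inverse is v -> -v
print(close(boost_from_rapidity(math.atanh(0.6)), lam))       # two forms of (5.4)
```

Nested lists are used rather than an array library so that the sketch depends only on the standard library.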


5.3 Cartesian basis vectors 

Figure 5.1 shows the coordinate curves for two systems of coordinates $x^\mu$ and
$x'^\mu$, corresponding to Cartesian inertial frames S and S' in standard configuration
(with the 2- and 3-directions suppressed). In any coordinate system the coordinate
basis vectors are tangents to the coordinate curves; these are shown for S and S'
in Figure 5.1 (in this diagram, null vectors would lie at 45 degrees to the vertical
axis). In general, the two sets of basis vectors are related by

$$\mathbf{e}'_\mu = \Lambda_\mu{}^\nu\, \mathbf{e}_\nu \quad \text{and} \quad \mathbf{e}_\mu = \Lambda^\nu{}_\mu\, \mathbf{e}'_\nu,$$

which tells us how to draw one set of basis vectors in terms of the other set.

Figure 5.1 The coordinate curves (dotted lines) for two systems of coordinates
$x^\mu$ and $x'^\mu$, corresponding to Cartesian inertial frames S and S' in standard
configuration. The coordinate basis vectors for each system are also shown. The
2- and 3-directions are suppressed, and null vectors would lie at 45 degrees to
the vertical axis.

The two sets of basis vectors satisfy 

$$\mathbf{e}_\mu \cdot \mathbf{e}_\nu = \mathbf{e}'_\mu \cdot \mathbf{e}'_\nu = \eta_{\mu\nu},$$

and so both sets form an orthonormal basis at each point in the pseudo-Euclidean
Minkowski spacetime. As drawn in Figure 5.1 the vectors $\mathbf{e}_\mu$ appear mutually
perpendicular, but the $\mathbf{e}'_\mu$ do not. This is an artefact of representing a
pseudo-Euclidean space on a Euclidean piece of paper. As we shall see, the notion of an
orthonormal set of basis vectors at any point in the spacetime is of fundamental
importance for our description of observers.

We can also define dual basis vectors for each system as

$$\mathbf{e}^\mu = \eta^{\mu\nu}\, \mathbf{e}_\nu \quad \text{and} \quad \mathbf{e}'^\mu = \eta^{\mu\nu}\, \mathbf{e}'_\nu.$$

These vectors also form orthonormal sets, since

$$\mathbf{e}^\mu \cdot \mathbf{e}^\nu = \mathbf{e}'^\mu \cdot \mathbf{e}'^\nu = \eta^{\mu\nu},$$

and the components $\eta^{\mu\nu}$ are identical to the components $\eta_{\mu\nu}$.


5.4 Four-vectors and the lightcone

As in any manifold, we can define vectors at any point P in Minkowski spacetime
(and thus vector fields).⁴ In relativity, vectors defined on a four-dimensional
spacetime manifold are called 4-vectors. These 4-vectors are geometrical entities
in spacetime, which can be defined without any reference to a basis (or coordinate
system). Nevertheless, in a particular coordinate system, we can write a general
4-vector $\mathbf{v}$ at P in terms of the coordinate basis vectors at P:

$$\mathbf{v} = v^\mu\, \mathbf{e}_\mu.$$

Let us assume for the moment that we are using Cartesian coordinates
corresponding to some inertial frame S. At each point P in spacetime we have a
constant set of orthonormal basis vectors $\mathbf{e}_\mu$. The square of the length of a vector
$\mathbf{v}$ at a point P (which is a coordinate-independent quantity) is then given by

$$\mathbf{v} \cdot \mathbf{v} = v^\mu v_\mu = \eta_{\mu\nu}\, v^\mu v^\nu.$$

We have that

for $\eta_{\mu\nu}\, v^\mu v^\nu > 0$ the vector is timelike;    (5.5)
for $\eta_{\mu\nu}\, v^\mu v^\nu = 0$ the vector is null;    (5.6)
for $\eta_{\mu\nu}\, v^\mu v^\nu < 0$ the vector is spacelike.    (5.7)

Thus, as we would expect, the coordinate basis vector $\mathbf{e}_0$, which has components
(1, 0, 0, 0), is timelike. Similarly, the basis vectors $\mathbf{e}_i$ ($i = 1, 2, 3$) are spacelike.
Moreover, for any timelike or null vector $\mathbf{v}$, if $\mathbf{v} \cdot \mathbf{e}_0 > 0$ then $\mathbf{v}$ is called
future-pointing whereas if $\mathbf{v} \cdot \mathbf{e}_0 < 0$ then $\mathbf{v}$ is past-pointing.

Figure 5.2 The lightcone at some point P in Minkowski spacetime (with one
spatial dimension suppressed). The labelled vectors are a future-pointing timelike
vector, a future-pointing null vector, a spacelike vector and a past-pointing
timelike vector.

4 In fact, since Minkowski spacetime is pseudo-Euclidean, the tangent space $T_P$ at any point P coincides with
the manifold itself. Thus, in this special case, we are not restricted to local vectors and can reinstate the notions
of position vector and of the displacement vector between arbitrary points in the manifold.

At any point P in the Minkowski spacetime, the set of all null vectors at P 
forms the lightcone or null-cone. The structure of the lightcone is illustrated in 
Figure 5.2, with one spatial dimension suppressed. 
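
The classification (5.5)-(5.7) and the future/past distinction translate directly into a few lines of code. The sketch below assumes components given in a Cartesian inertial frame with metric diag(1, -1, -1, -1); the function names `minkowski_dot` and `classify` and the tolerance `tol` are illustrative choices, not part of the text.

```python
def minkowski_dot(v, w):
    """Scalar product eta_{mu nu} v^mu w^nu with signature (+, -, -, -)."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2] - v[3] * w[3]


def classify(v, tol=1e-12):
    """Return the type of the 4-vector v, following (5.5)-(5.7)."""
    length_sq = minkowski_dot(v, v)
    if length_sq < -tol:
        return "spacelike"
    kind = "timelike" if length_sq > tol else "null"
    # The sign of v . e_0 decides future- versus past-pointing.
    projection = minkowski_dot(v, (1.0, 0.0, 0.0, 0.0))
    if projection > 0:
        return "future-pointing " + kind
    if projection < 0:
        return "past-pointing " + kind
    return kind  # only the zero vector reaches this line


print(classify((1.0, 0.0, 0.0, 0.0)))   # the basis vector e_0
print(classify((1.0, 1.0, 0.0, 0.0)))   # a vector on the lightcone
print(classify((0.0, 1.0, 0.0, 0.0)))   # the basis vector e_1
```

A finite tolerance is used for the null case because components computed in floating point rarely give an exactly vanishing length.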


5.5 Four-vectors and Lorentz transformations 

Suppose that the Cartesian coordinates $x^\mu$ and $x'^\mu$ correspond to inertial frames
S and S'. Thus, at each point P in the Minkowski spacetime we have two sets of
(constant) basis vectors $\mathbf{e}_\mu$ and $\mathbf{e}'_\mu$, and a general 4-vector $\mathbf{v}$ defined at P can be
expressed in terms of either set:

$$\mathbf{v} = v^\mu\, \mathbf{e}_\mu = v'^\mu\, \mathbf{e}'_\mu.$$

Thus, the components in the two bases are related by

$$v'^\mu = \mathbf{v} \cdot \mathbf{e}'^\mu = \Lambda^\mu{}_\nu\, v^\nu, \qquad v^\mu = \mathbf{v} \cdot \mathbf{e}^\mu = \Lambda_\nu{}^\mu\, v'^\nu, \qquad (5.8)$$

where $\Lambda^\mu{}_\nu$ is the Lorentz transformation linking the coordinates $x^\mu$ and $x'^\mu$. Let
us now consider some examples of physical 4-vectors and investigate the physical
consequences of these transformations.


5.6 Four-velocity 

A particularly important 4-vector is the 4-velocity of a (massive) particle (or
observer). As discussed in Chapter 1, the trajectory of a particle describes a curve
$\mathcal{C}$ or worldline in spacetime. We could parameterise this curve in any way we
wish, but for massive particles it is usual to parameterise it using the proper time
$\tau$ measured by the particle. The 4-velocity $\mathbf{u}$ of the particle at any event is then
the tangent vector to the worldline at that event. For a massive particle, $\mathbf{u}$ is a
future-pointing timelike vector. The length of this tangent vector (which is defined
independently of any coordinate system) is constant along the worldline, since
(as shown in Chapter 3)

$$\mathbf{u} \cdot \mathbf{u} = c^2. \qquad (5.9)$$

Since $\tau$ is proportional to the interval $s$ along the worldline, it is an affine parameter
(see Chapter 3).

Suppose that we label spacetime with some Cartesian coordinate system
corresponding to an inertial frame S. We can then write the worldline of a particle
in this coordinate system as $x^\mu = x^\mu(\tau)$. Figure 5.3 shows the 4-velocity at two
events on the worldline of a particle moving at uniform velocity in the frame S.
In this case the direction of the 4-velocity is also constant along the worldline.
The figure also shows the 4-velocity at two events on the worldline of a particle
that is accelerating (back and forth) with respect to the frame S. Clearly, in this
case, the direction of the 4-velocity changes along the worldline.

Figure 5.3 The 4-velocity at events along the worldlines of a particle travelling
at uniform speed in S (solid line) and a particle accelerating with respect to S
(broken line).

The (contravariant) components of the 4-velocity in the frame S are given by

$$u^\mu = \mathbf{u} \cdot \mathbf{e}^\mu = \frac{dx^\mu}{d\tau}. \qquad (5.10)$$

Setting $x^0 = ct$ for the moment, and noting that $d\tau = dt/\gamma_u$, we can write these
components as

$$[u^\mu] = \gamma_u \left( c, \frac{dx^1}{dt}, \frac{dx^2}{dt}, \frac{dx^3}{dt} \right) = \gamma_u\, (c, \vec{u}), \qquad (5.11)$$

where in the last equality (with a slight abuse of notation) we have introduced the
relative 3-vector $\vec{u} = (u^1, u^2, u^3)$, which is the familiar (three-dimensional) velocity
vector of the particle as measured by an observer at rest in S.

In some other inertial frame S', the components of the 4-velocity of the particle
are

$$u'^\mu = \mathbf{u} \cdot \mathbf{e}'^\mu = \Lambda^\mu{}_\nu\, u^\nu.$$

Writing this out in full for the case where S and S' are in standard configuration
with relative speed $v$, we obtain

$$\begin{pmatrix} \gamma_{u'} c \\ \gamma_{u'} u'^1 \\ \gamma_{u'} u'^2 \\ \gamma_{u'} u'^3 \end{pmatrix}
= \begin{pmatrix} \gamma_v & -\beta\gamma_v & 0 & 0 \\ -\beta\gamma_v & \gamma_v & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \gamma_u c \\ \gamma_u u^1 \\ \gamma_u u^2 \\ \gamma_u u^3 \end{pmatrix}.$$

This is equivalent to four equations. From the first, we find that

$$\frac{\gamma_u}{\gamma_{u'}} = \frac{1}{\gamma_v}\, \frac{1}{1 - u^1 v/c^2},$$

and from the others we obtain the 3-velocity addition law in special relativity,

$$u'^1 = \frac{u^1 - v}{1 - u^1 v/c^2}, \qquad u'^2 = \frac{u^2}{\gamma_v (1 - u^1 v/c^2)}, \qquad u'^3 = \frac{u^3}{\gamma_v (1 - u^1 v/c^2)}.$$


Note that this approach has allowed us to derive the 3-velocity addition law in an 
almost trivial way. 
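
The derivation above can be mirrored numerically: build the 4-velocity from a 3-velocity using (5.11), apply the boost (5.4), and read the new 3-velocity off the transformed components. The following is a minimal sketch; the function names and the sample velocities are illustrative assumptions only.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(speed):
    """Lorentz factor for a given speed."""
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def four_velocity(u3):
    """Components [u^mu] = gamma_u (c, u) of (5.11) for a 3-velocity u3."""
    g = gamma(math.sqrt(sum(x * x for x in u3)))
    return [g * C, g * u3[0], g * u3[1], g * u3[2]]


def boost_x(vec, v):
    """Apply the standard-configuration boost (5.4) to contravariant components."""
    beta, g = v / C, gamma(v)
    return [g * (vec[0] - beta * vec[1]), g * (vec[1] - beta * vec[0]), vec[2], vec[3]]


def three_velocity(u4):
    """Recover the 3-velocity from 4-velocity components: u^i = c u^i / u^0."""
    return [C * u4[i] / u4[0] for i in (1, 2, 3)]


def velocity_addition_law(u3, v):
    """The closed-form 3-velocity addition law derived in this section."""
    denom = 1.0 - u3[0] * v / C**2
    g = gamma(v)
    return [(u3[0] - v) / denom, u3[1] / (g * denom), u3[2] / (g * denom)]


u3 = [0.5 * C, 0.3 * C, 0.1 * C]   # sample velocity of the particle in S
v = 0.6 * C                         # sample speed of S' relative to S
print(three_velocity(boost_x(four_velocity(u3), v)))
print(velocity_addition_law(u3, v))
```

The two printed lists should agree to rounding error, which is the content of the statement that the addition law follows from the transformation of a single 4-vector.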


5.7 Four-momentum of a massive particle 

The 4-momentum of a massive particle of rest mass $m_0$ is defined in terms of its
4-velocity $\mathbf{u}$ by

$$\mathbf{p} = m_0 \mathbf{u}.$$

At any point P along the particle's worldline the square of the length of this
vector is

$$\mathbf{p} \cdot \mathbf{p} = m_0 \mathbf{u} \cdot m_0 \mathbf{u} = m_0^2 c^2. \qquad (5.12)$$

In Cartesian coordinates $x^\mu$ corresponding to some inertial frame S, the
components of the 4-momentum are simply $p^\mu = \mathbf{p} \cdot \mathbf{e}^\mu$. According to convention we write

$$[p^\mu] = (E/c, p^1, p^2, p^3) = (E/c, \vec{p}), \qquad (5.13)$$

where $E$ is the energy of the particle as measured in the frame S and $\vec{p}$ is its
3-momentum measured in S. Comparing (5.13) with (5.11) we see that, in special
relativity,

$$E = \gamma_u m_0 c^2, \qquad (5.14)$$

$$\vec{p} = \gamma_u m_0 \vec{u}. \qquad (5.15)$$

In the frame S, the squared length of the 4-momentum is given by $p^\mu p_\mu$. Thus,
from (5.13) and (5.12), we find that

$$E^2 - p^2 c^2 = m_0^2 c^4,$$

where $p^2 = \vec{p} \cdot \vec{p}$. This is the well-known energy-momentum invariant.

5.8 Four-momentum of a photon 

The above discussion concerned particles of non-zero rest mass, which move
at speeds less than $c$. We now consider particles such as photons and perhaps
neutrinos, which move at the speed of light. The worldline of a massless particle is
a null curve, along which $d\tau = 0$. Thus, we cannot parameterise such a worldline
using the proper time $\tau$. Nevertheless, there are many other parameters that we
can use. For example, in an inertial frame, a photon travelling in the positive
x-direction will describe the path $x = ct$. This could be written parametrically as

$$x^\mu = u^\mu \sigma, \qquad (5.16)$$

where $\sigma$ is the parameter and $[u^\mu] = (1, 1, 0, 0)$. Using (3.43), the tangent vector
to the worldline is then

$$\mathbf{u} = \frac{dx^\mu}{d\sigma}\, \mathbf{e}_\mu = u^\mu\, \mathbf{e}_\mu.$$

Since the worldline is a null curve, we have

$$\mathbf{u} \cdot \mathbf{u} = 0, \qquad (5.17)$$

in contrast with (5.9). Moreover, with this choice of parameter $\sigma$ we see that

$$\frac{d\mathbf{u}}{d\sigma} = 0, \qquad (5.18)$$

which is the equation of motion for a photon. We note that although this has
been derived using the fact that the Cartesian basis vectors $\mathbf{e}_\mu$ do not change
with position, it is a vector equation and therefore will hold in any basis (i.e. any
coordinate system).

Our choice of parameterisation in (5.16) may appear somewhat arbitrary.
Indeed, it is true that there exists an unlimited number of parameterisations that
could be used. For example, suppose that we replaced $\sigma$ by $\sigma^2$ (say). As the new
parameter $\sigma$ varies between $-\infty$ and $\infty$, the same worldline $x = ct$ would be
described in the spacetime. Since this is a null curve, the condition (5.17) would
continue to be true (as may be verified explicitly). In the new parameterisation,
however, the equation of motion (5.18) would not still hold. The special class of
parameters for which the equation of motion has the simple form (5.18) is the
class of affine parameters (as discussed in Section 3.16). Since one is always free
to choose such a parameter, we will assume from here on that equation (5.18) is
satisfied.

So far, we have not mentioned the frequency (or energy) of the photon, which
characterises it in much the same way as the rest mass $m_0$ characterises a massive
particle. Clearly, the tangent vector $\mathbf{u}$ can be multiplied by any scalar constant and
will still satisfy the equations (5.17) and (5.18). The 4-momentum of a photon is
therefore defined as

$$\mathbf{p} = \alpha \mathbf{u},$$

for a constant $\alpha$ chosen such that, in an arbitrary inertial frame S, the components
of $\mathbf{p}$ are

$$[p^\mu] = (E/c, \vec{p}),$$

where $E$ is the energy of the photon as measured in S and $\vec{p}$ is its 3-momentum
in S. From (5.17) we thus have $E = pc$.

For photons, it is also common to introduce the 4-wavevector $\mathbf{k}$, which is related
to the 4-momentum by $\mathbf{p} = \hbar \mathbf{k}$. Thus, in the frame S, the 4-wavevector has
components given by

$$[k^\mu] = (2\pi/\lambda, \vec{k}),$$

where $\lambda$ is the wavelength of the photon as measured in S, $\vec{k} = (2\pi/\lambda)\vec{n}$, and
$\vec{n}$ is a unit 3-vector in the direction of propagation.


5.9 The Doppler effect and relativistic aberration 

An example of the usefulness of the 4-vector approach (and particularly the photon
4-wavevector) is provided by the Doppler effect. Suppose that an observer $\mathcal{O}$ is at
rest in some Cartesian inertial frame S defined by the coordinates $x^\mu$ in spacetime.
Let us also suppose that a source of radiation is moving relative to S with a speed
$v$ in the positive $x^1$-direction and that at some event P the observer receives a
photon of wavelength $\lambda$ in a direction that makes an angle $\theta$ with the positive
$x^1$-direction. Thus, at the event P the components $k^\mu = \mathbf{k} \cdot \mathbf{e}^\mu$ of the photon's
4-wavevector in this coordinate system are

$$[k^\mu] = \frac{2\pi}{\lambda}\, (1, \cos\theta, \sin\theta, 0).$$

The photon observed at the event P must have been emitted by the source at some
other event Q (say). However, the equation of motion of a photon implies that
its 4-momentum $\mathbf{p}$, and hence its 4-wavevector $\mathbf{k}$, is constant along its worldline.
Thus the photon's 4-wavevector $\mathbf{k}$ at the event Q is the same as that at the
event P.

Let us denote the Cartesian inertial frame in which the radiation source is at
rest by S' (whose spatial axes are assumed not to be rotated with respect to those
of S); this frame is represented by the coordinates $x'^\mu$ in spacetime. Thus, at
the event Q the components in S' of the photon's 4-wavevector are given by
$k'^\mu = \mathbf{k} \cdot \mathbf{e}'^\mu$ and read

$$k'^\mu = \Lambda^\mu{}_\nu\, k^\nu, \qquad (5.19)$$

where $[\Lambda^\mu{}_\nu]$ is given by (5.4).

We denote these components in S' by

$$[k'^\mu] = \frac{2\pi}{\lambda'}\, (1, \cos\theta', \sin\theta', 0).$$

The zeroth component of (5.19) relates the proper wavelength $\lambda'$ and the
observed wavelength $\lambda$:

$$\frac{\lambda}{\lambda'} = \gamma (1 - \beta\cos\theta).$$

This equation contains all the familiar Doppler effect results as special cases. If
$\theta = 0$, the source must be approaching the observer along the negative $x^1$-axis.
If $\theta = \pi$, the source is receding from the observer along the positive $x^1$-axis.
Finally, if $\theta = \pm\pi/2$ we obtain the transverse Doppler effect. Similarly, from the
1- and 2-components of (5.19) we obtain immediately

$$\tan\theta' = \frac{\tan\theta}{\gamma\, [1 - (v/c)\sec\theta]},$$


which is a version of the relativistic aberration formula. 
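
As a numerical cross-check of this section, one can boost the components of the 4-wavevector with (5.4) and compare the result with the closed-form Doppler and aberration expressions obtained from (5.19). The sketch below is illustrative only: the function names and the sample wavelength, angle and speed are assumptions, not values from the text.

```python
import math


def wavevector(wavelength, theta):
    """Components [k^mu] = (2 pi / lambda)(1, cos(theta), sin(theta), 0) in S."""
    k0 = 2.0 * math.pi / wavelength
    return [k0, k0 * math.cos(theta), k0 * math.sin(theta), 0.0]


def boost_x(vec, beta):
    """Apply the standard-configuration boost (5.4) with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [g * (vec[0] - beta * vec[1]), g * (vec[1] - beta * vec[0]), vec[2], vec[3]]


def in_source_frame(wavelength, theta, beta):
    """Wavelength and propagation angle of the photon in the source frame S'."""
    k_prime = boost_x(wavevector(wavelength, theta), beta)
    proper_wavelength = 2.0 * math.pi / k_prime[0]
    theta_prime = math.atan2(k_prime[2], k_prime[1])
    return proper_wavelength, theta_prime


lam, theta, beta = 500e-9, 1.0, 0.5   # sample observed wavelength (m), angle (rad), v/c
g = 1.0 / math.sqrt(1.0 - beta * beta)
lam_prime, theta_prime = in_source_frame(lam, theta, beta)

# Doppler ratio from the boosted components versus gamma (1 - beta cos(theta)).
print(lam / lam_prime, g * (1.0 - beta * math.cos(theta)))
# Aberration from the boosted components versus the closed-form expression.
print(math.tan(theta_prime), math.tan(theta) / (g * (1.0 - beta / math.cos(theta))))
```

Each printed pair should agree to rounding error; setting `theta` to 0, pi or pi/2 reproduces the approaching, receding and transverse cases discussed above.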
5.10 Relativistic mechanics

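In relativistic mechanics, the equation of motion of a massive particle is given by

$$\frac{d\mathbf{p}}{d\tau} = \mathbf{f},$$

where $\mathbf{f}$ is the 4-force. In some Cartesian inertial frame S (for which the basis
vectors are constant throughout the spacetime) the components $f^\mu$ of the 4-force
are given by the familiar expression

$$f^\mu = \mathbf{e}^\mu \cdot \mathbf{f} = \mathbf{e}^\mu \cdot \frac{d}{d\tau}\left(p^\nu \mathbf{e}_\nu\right) = \frac{dp^\nu}{d\tau}\, \delta^\mu_\nu = \frac{dp^\mu}{d\tau},$$

where we have used the fact that $\mathbf{e}^\mu$ and $\mathbf{e}_\nu$ are reciprocal sets of vectors. Noting
that $d\tau = dt/\gamma_u$, we may write

$$[f^\mu] = \gamma_u \frac{d}{dt}\left(\frac{E}{c}, \vec{p}\right) = \gamma_u \left(\frac{\vec{f} \cdot \vec{u}}{c}, \vec{f}\right),$$

where in the last equality we have introduced the familiar 3-force $\vec{f}$ as measured
in the frame S, and $\vec{u}$ is the 3-velocity in this frame. Writing the components
in this way, the time and space parts of the equation of motion in S are
(as required)

$$\frac{1}{\gamma_u} \frac{dE}{d\tau} = \frac{dE}{dt} = \vec{f} \cdot \vec{u}, \qquad (5.20)$$

$$\frac{1}{\gamma_u} \frac{d\vec{p}}{d\tau} = \frac{d\vec{p}}{dt} = \vec{f}, \qquad (5.21)$$

where $E$ and $\vec{p}$ are given by (5.14) and (5.15) respectively.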

There is, however, a certain rarely discussed subtlety in relativistic mechanics.
Let us consider the scalar product $\mathbf{u} \cdot \mathbf{f}$, which is of course invariant under
coordinate transformations. This is given by

$$\mathbf{u} \cdot \mathbf{f} = \mathbf{u} \cdot \frac{d\mathbf{p}}{d\tau} = \mathbf{u} \cdot \left(\frac{dm_0}{d\tau}\, \mathbf{u} + m_0 \frac{d\mathbf{u}}{d\tau}\right) = c^2 \frac{dm_0}{d\tau} + m_0\, \mathbf{u} \cdot \frac{d\mathbf{u}}{d\tau} = c^2 \frac{dm_0}{d\tau},$$

where we have (twice) used the fact that $\mathbf{u} \cdot \mathbf{u} = c^2$. Thus, we see that in special
relativity the action of a force can alter the rest mass of a particle! A force that
preserves the rest mass is called a pure force and must satisfy $\mathbf{u} \cdot \mathbf{f} = 0$.

If so desired, one can also introduce the 4-acceleration of a particle, $\mathbf{a} = d\mathbf{u}/d\tau$,
in terms of which a pure 4-force takes the familiar form $\mathbf{f} = m_0 \mathbf{a}$. In some
Cartesian inertial frame S, the components of the 4-acceleration are

$$[a^\mu] = \left[\frac{du^\mu}{d\tau}\right] = \gamma_u \frac{d}{dt}\left(\gamma_u c, \gamma_u \vec{u}\right) = \gamma_u \left(c \frac{d\gamma_u}{dt}, \gamma_u \frac{d\vec{u}}{dt} + \vec{u} \frac{d\gamma_u}{dt}\right) = \gamma_u \left(c \frac{d\gamma_u}{dt}, \gamma_u \vec{a} + \vec{u} \frac{d\gamma_u}{dt}\right),$$

where $\vec{a} = d\vec{u}/dt$ is the 3-acceleration in the frame S.


5.11 Free particles 

We now come to a very important observation concerning relativistic mechanics.
In the absence of any forces, the equation of motion of a massive particle is

$$\frac{d\mathbf{p}}{d\tau} = 0, \qquad (5.22)$$

where the proper time $\tau$ is an affine parameter along the particle's worldline.
Similarly, the equation of motion of a photon is

$$\frac{d\mathbf{p}}{d\sigma} = 0, \qquad (5.23)$$

where $\sigma$ is some affine parameter along the photon's worldline. However, in each
case the 4-momentum $\mathbf{p}$ at some point on the worldline is simply a fixed multiple
of the tangent vector to the worldline at that point. Thus, equations (5.22) and
(5.23) say that tangent vectors to the worldlines of free particles and of photons
form a parallel field of vectors along the worldline. From Chapter 2 we know
that this is the definition of an affinely parameterised geodesic. Thus, in special
relativity the worldlines of free particles and photons are respectively non-null
and null geodesics in Minkowski spacetime.


5.12 Relativistic collisions and Compton scattering 

We note from (5.22) and (5.23) that the conservation of energy and momentum
for a free particle or photon is represented by the single equation $\mathbf{p}$ = constant.
We can, of course, add the 4-momenta of different particles. Thus for a system of
$n$ interacting particles $i = 1, 2, \ldots, n$ with no external forces, we have $\sum_{i=1}^{n} \mathbf{p}_i$ =
constant, which is very useful in relativistic-collision calculations.

Figure 5.4 The Compton effect.

An important example of a relativistic collision is Compton scattering, in which
a photon of 4-momentum $\mathbf{p}$ collides with an electron of 4-momentum $\mathbf{q}$. It is
easiest to consider the collision in the inertial frame S in which the electron is at
rest and the photon is travelling along the positive $x^1$-axis (see Figure 5.4). Thus
the components of $\mathbf{p}$ and $\mathbf{q}$ in S are

$$[p^\mu] = (h\nu/c, h\nu/c, 0, 0),$$

$$[q^\mu] = (m_e c, 0, 0, 0),$$

where $\nu$ is the frequency of the photon as measured by a stationary observer in
S, and $m_e$ is the rest mass of the electron. Let us assume that, after the collision,
the photon and electron have 4-momenta $\bar{\mathbf{p}}$ and $\bar{\mathbf{q}}$ such that they move off in the
plane $x^3 = 0$, making angles $\theta$ and $\phi$ respectively with the $x^1$-axis. Thus

$$[\bar{p}^\mu] = (h\bar{\nu}/c, (h\bar{\nu}/c)\cos\theta, (h\bar{\nu}/c)\sin\theta, 0),$$

$$[\bar{q}^\mu] = (\gamma_u m_e c, \gamma_u m_e u \cos\phi, -\gamma_u m_e u \sin\phi, 0),$$

where $u$ is the electron's speed and $\bar{\nu}$ is the photon frequency as measured by
a stationary observer in S after the collision. Conservation of total 4-momentum
means that

$$p^\mu + q^\mu = \bar{p}^\mu + \bar{q}^\mu,$$

which gives

$$h\nu/c + m_e c = h\bar{\nu}/c + \gamma_u m_e c, \qquad (5.24)$$

$$h\nu/c = (h\bar{\nu}/c)\cos\theta + \gamma_u m_e u \cos\phi, \qquad (5.25)$$

$$0 = (h\bar{\nu}/c)\sin\theta - \gamma_u m_e u \sin\phi. \qquad (5.26)$$

Eliminating $u$ and $\phi$ from these equations leads to the formula for Compton
scattering, which gives the frequency of the photon in S after the collision:

$$\bar{\nu} = \nu \left[1 + \frac{h\nu}{m_e c^2}(1 - \cos\theta)\right]^{-1}.$$

The components of the 4-momentum $\mathbf{p}$ (or $\mathbf{q}$) in any other inertial frame S'
can be found easily by using $p'^\mu = \Lambda^\mu{}_\nu\, p^\nu$, where $\Lambda^\mu{}_\nu$ are the elements of the
Lorentz transformation matrix connecting the frames S and S'.
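
The Compton formula can be tested against 4-momentum conservation directly. The sketch below works in units with $m_e = c = 1$, so that the photon energy is the dimensionless quantity $\epsilon = h\nu/(m_e c^2)$; this choice of units, the function names and the sample values are assumptions made for illustration. The recoil 4-momentum is obtained from conservation, and the check is that it lies on the electron mass shell, which happens only if the scattered frequency obeys the formula above.

```python
import math


def minkowski_sq(p):
    """Squared length p . p with signature (+, -, -, -)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def compton_scatter(eps, theta):
    """Return the photon and electron 4-momenta after the collision.

    eps is h nu / (m_e c^2) before the collision and theta is the photon
    scattering angle; units are chosen so that m_e = c = 1.
    """
    eps_bar = eps / (1.0 + eps * (1.0 - math.cos(theta)))      # Compton formula
    p = [eps, eps, 0.0, 0.0]                                    # incoming photon
    q = [1.0, 0.0, 0.0, 0.0]                                    # electron at rest
    p_bar = [eps_bar, eps_bar * math.cos(theta), eps_bar * math.sin(theta), 0.0]
    q_bar = [p[i] + q[i] - p_bar[i] for i in range(4)]          # conservation
    return p_bar, q_bar


def electron_recoil(q_bar):
    """Lorentz factor, speed u/c and angle phi of the recoiling electron."""
    gamma_u = q_bar[0]
    speed = math.sqrt(1.0 - 1.0 / gamma_u**2)
    phi = math.atan2(-q_bar[2], q_bar[1])   # sign convention of the text
    return gamma_u, speed, phi


p_bar, q_bar = compton_scatter(0.5, 1.2)
print(minkowski_sq(q_bar))        # should be close to m_e^2 c^2 = 1
print(electron_recoil(q_bar))
```

Replacing the Compton formula in `compton_scatter` by any other relation between the two photon energies makes the first printed value drift away from 1, which is a quick way to see that the formula is forced by conservation of 4-momentum.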


5.13 Accelerating observers 

So far we have only considered inertial observers, who move at uniform speeds
with respect to one another. Let us now consider a general observer $\mathcal{O}$, who
may be accelerating with respect to some inertial frame S. If the observer has a
4-velocity $\mathbf{u}(\tau)$, where $\tau$ is the proper time measured along the worldline, then
his 4-acceleration is given by

$$\mathbf{a} = \frac{d\mathbf{u}}{d\tau}.$$

It is worth noting that, at any given event P, the 4-acceleration $\mathbf{a}$ is always
orthogonal to the corresponding 4-velocity $\mathbf{u}$, since

$$\mathbf{a} \cdot \mathbf{u} = \frac{1}{2} \frac{d}{d\tau}(\mathbf{u} \cdot \mathbf{u}) = \frac{1}{2} \frac{d}{d\tau}(c^2) = 0. \qquad (5.27)$$

An accelerating observer has no inertial frame in which he or she is always at
rest. Nevertheless, at any event P along the worldline we can define an
instantaneous rest frame S', in which the observer $\mathcal{O}$ is momentarily at rest. Since
the observer is at rest in S', the timelike basis vector $\mathbf{e}'_0$ of this frame must be
parallel to the 4-velocity $\mathbf{u}$ of the observer. The remaining spacelike basis vectors
$\mathbf{e}'_i$ ($i = 1, 2, 3$) of S' are all orthogonal to $\mathbf{e}'_0$ and to one another and will depend on
the relative velocity of S and S' and the relative orientation of their spatial axes.
Observations made by $\mathcal{O}$ at the event P thus correspond to measurements made
in the instantaneous rest frame (IRF) S' at P. This is illustrated in Figure 5.5.

Figure 5.5 The basis vectors $\mathbf{e}'_0$, $\mathbf{e}'_1$ at the event P in the instantaneous rest frame
S' of an observer $\mathcal{O}$ who is accelerating with respect to the inertial frame S.

Thus, the notion of a localised laboratory can be idealised as follows. An
observer (whether accelerating or not) carries along four orthogonal unit vectors
$\mathbf{e}'_\mu(\tau)$ (or tetrad), which vary along his worldline but always satisfy

$$\mathbf{e}'_\mu(\tau) \cdot \mathbf{e}'_\nu(\tau) = \eta_{\mu\nu}. \qquad (5.28)$$

In particular, the timelike unit vector is given by

$$\mathbf{e}'_0(\tau) = \hat{\mathbf{u}}(\tau), \qquad (5.29)$$

where $\hat{\mathbf{u}}(\tau)$ is the normalised 4-velocity of the observer and is simply $\mathbf{u}(\tau)/c$. At
any event P along the observer's worldline, the tetrad comprises the basis vectors
of the Cartesian IRF at the event P and defines a time direction and three space
directions to which the observer will refer all measurements. Thus, the results of
any measurement made by the observer at the event P are given by projections
of physical quantities (i.e. vectors and tensors) onto these tetrad vectors.

An important example occurs when the worldline of the observer intersects the
worldline of some particle at the event P (at which we take the observer's proper
time to be $\tau$). If $\mathbf{p}$ is the 4-momentum of the particle at this event then the energy
$E'$ of the particle as measured by the observer is given by

$$\frac{E'}{c} = \mathbf{p} \cdot \mathbf{e}'_0(\tau) \quad \Rightarrow \quad E' = \mathbf{p} \cdot \mathbf{u}(\tau).$$

Similarly, the covariant components $p'_i$ of the spatial momentum of the particle
as measured by the observer are given by

$$p'_i = \mathbf{p} \cdot \mathbf{e}'_i(\tau).$$

Another example is provided by the 4-acceleration $\mathbf{a}$. Since at any event P on the
worldline we have $\mathbf{e}'_0 = \hat{\mathbf{u}}$, the orthogonality condition (5.27) and the fact that in
the IRF $[u'^\mu] = (c, \vec{0})$ imply that the components of the 4-acceleration in the IRF
are $[a'^\mu] = (0, \vec{a}')$. Thus the magnitude of the 3-acceleration in the IRF can be
computed as the simple invariant $\sqrt{|\mathbf{a} \cdot \mathbf{a}|}$.

It is interesting to consider how the tetrad of basis vectors changes along
the worldline of an observer whose acceleration varies arbitrarily with time.
As it is transported along the observer's worldline, the tetrad must satisfy the
two requirements (5.28) and (5.29). Clearly, given $\mathbf{u}(\tau)$ the condition (5.29)
determines the timelike basis vector $\mathbf{e}'_0(\tau)$ uniquely. Unfortunately, condition
(5.28) is obviously insufficient to determine uniquely the evolution of the spacelike
basis vectors $\mathbf{e}'_i(\tau)$ ($i = 1, 2, 3$), which reflect the different ways in which the
observer's local laboratory might be spinning and tumbling. An important special
case, however, is when the tetrad is 'non-rotating'.

This last requirement needs some clarification. Clearly, the basis vectors of
the tetrad at any proper time $\tau$ are related to the basis vectors $\mathbf{e}_\mu$ of some given
inertial frame by the Lorentz transformation

$$\mathbf{e}'_\mu(\tau) = \Lambda_\mu{}^\nu(\tau)\, \mathbf{e}_\nu.$$

Thus the tetrad basis vectors at two successive instants must also be related to each
other by a Lorentz transformation, which can be thought of as a 'rotation' in
spacetime. A 'non-rotating' tetrad is one where the basis vectors $\mathbf{e}'_\mu(\tau)$ change from instant
to instant by precisely the amount implied by the rate of change of $\mathbf{u}$ but with no
additional rotation. In other words, we accept the inevitable rotation in the timelike
plane defined by $\mathbf{u}$ and $\mathbf{a}$ but rule out any ordinary rotation of the 3-space vectors.

Since we wish to treat the time and space directions on an equal footing, we must
seek a general expression for the rate of change $d\mathbf{e}'_\mu/d\tau$ of a basis vector along
the worldline such that: (i) it generates the appropriate Lorentz transformation if
$\mathbf{e}'_\mu$ lies in the timelike plane defined by $\mathbf{u}$ and $\mathbf{a}$, and (ii) it excludes any rotation
if $\mathbf{e}'_\mu$ lies in any other plane, in particular any spacelike plane. A little reflection
shows that the unique answer to these requirements is

$$\frac{d\mathbf{e}'_\mu}{d\tau} = \frac{1}{c^2} \left[ (\mathbf{u} \cdot \mathbf{e}'_\mu)\, \mathbf{a} - (\mathbf{a} \cdot \mathbf{e}'_\mu)\, \mathbf{u} \right]. \qquad (5.30)$$

Any vector that undergoes the above transformation is said to be Fermi-Walker
transported along the worldline. From (5.30), we find that if $\mathbf{e}'_\mu$ is orthogonal to
both $\mathbf{u}$ and $\mathbf{a}$ then $d\mathbf{e}'_\mu/d\tau = 0$ as required. Moreover, we see that $d\mathbf{e}'_0/d\tau = \mathbf{a}/c$,
again as required.
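
A concrete check of (5.30) is possible for the uniformly accelerated spaceship of Exercises 5.5 and 5.6, whose 4-velocity and 4-acceleration are known in closed form. The sketch below works in units with $c = 1$. The spatial vector `e1`, with components $(\sinh(\alpha\tau/c), \cosh(\alpha\tau/c), 0, 0)$, is an assumed candidate for the tetrad vector along the direction of motion and is not given in the text; `alpha` denotes the proper acceleration. The code compares a finite-difference derivative of each tetrad vector with the right-hand side of (5.30).

```python
import math

C = 1.0  # units with c = 1


def mdot(v, w):
    """Minkowski scalar product with signature (+, -, -, -)."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2] - v[3] * w[3]


def observer(alpha, tau):
    """4-velocity, 4-acceleration and candidate vectors e0, e1 at proper time tau."""
    x = alpha * tau / C
    u = [C * math.cosh(x), C * math.sinh(x), 0.0, 0.0]
    a = [alpha * math.sinh(x), alpha * math.cosh(x), 0.0, 0.0]
    e0 = [component / C for component in u]          # condition (5.29)
    e1 = [math.sinh(x), math.cosh(x), 0.0, 0.0]      # assumed spatial tetrad vector
    return u, a, e0, e1


def fermi_walker_rhs(e, u, a):
    """Right-hand side of (5.30) for a vector e."""
    ue, ae = mdot(u, e), mdot(a, e)
    return [(ue * a[i] - ae * u[i]) / C**2 for i in range(4)]


def derivative(index, alpha, tau, h=1e-6):
    """Central-difference d/dtau of tetrad vector number index (0 or 1)."""
    plus = observer(alpha, tau + h)[2 + index]
    minus = observer(alpha, tau - h)[2 + index]
    return [(plus[i] - minus[i]) / (2.0 * h) for i in range(4)]


alpha, tau = 0.8, 0.5   # sample proper acceleration and proper time
u, a, e0, e1 = observer(alpha, tau)
print(derivative(0, alpha, tau), fermi_walker_rhs(e0, u, a))
print(derivative(1, alpha, tau), fermi_walker_rhs(e1, u, a))
```

For this motion the vectors $\mathbf{e}_2$ and $\mathbf{e}_3$ of the inertial frame are orthogonal to both $\mathbf{u}$ and $\mathbf{a}$, so they complete a non-rotating tetrad without any change along the worldline.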

A physical example of a 3-space vector that does not rotate along the worldline
is the spin (i.e. the angular momentum vector) of a gyroscope that the observer
accelerates with himself by means of forces applied to its centre of mass (so that
there are no torques). Indeed, a careful observer could set up a non-rotating tetrad
by aligning his three spatial axes using such gyroscopes.


5.14 Minkowski spacetime in arbitrary coordinates 

There is no need to label events in Minkowski spacetime with the Cartesian
inertial coordinates we have used thus far. The advantage of Cartesian coordinates
$X^\mu$, which put the line element into the form⁵

$$ds^2 = \eta_{\mu\nu}\, dX^\mu\, dX^\nu \qquad (5.31)$$

(even just at a particular event P), is that they have a clear physical meaning, i.e.
they correspond to time and distances measured by an observer at P who is at rest
in some inertial frame S labelled using three-dimensional Cartesian coordinates
(we will prove this below). Nevertheless, we are free to label events in spacetime
using any arbitrary system of coordinates $x^\mu$ although, in general, the coordinates
in such an arbitrary system may not have simple physical meanings.

Since the path of a free massive particle is a geodesic in Minkowski spacetime,
its worldline $x^\mu(\tau)$ in some arbitrary coordinate system is given by the geodesic
equations

$$\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma} \frac{dx^\nu}{d\tau} \frac{dx^\sigma}{d\tau} = 0. \qquad (5.32)$$

An inertial frame S is defined as one in which a free particle moves in a straight
line with fixed speed. Thus from (5.31) it is clear that coordinates $X^\mu$, such that
(5.31) holds, define an inertial frame. In this case, the connection $\Gamma^\mu{}_{\nu\sigma}$ vanishes,
and so the worldline of a particle is given by

$$\frac{d^2 X^\mu}{d\tau^2} = 0. \qquad (5.33)$$

Setting $[X^\mu] = (cT, X, Y, Z)$ for the moment, the $\mu = 0$ equation (5.33) shows
that $dT/d\tau$ = constant. Thus the $\mu = 1, 2, 3$ equations read

$$\frac{d^2 X}{dT^2} = \frac{d^2 Y}{dT^2} = \frac{d^2 Z}{dT^2} = 0,$$

from which we see immediately that a free particle moves in a straight line with
constant speed in S.

5 In the interest of clarity, in this section we will denote Cartesian inertial coordinates by $X^\mu$ and an arbitrary
coordinate system by $x^\mu$.

We could label the inertial frame S using three-dimensional spatial coordinates
that are not Cartesian, however. For example, we could use spherical polar
coordinates. This would correspond to making a change of variables in Minkowski
spacetime to the new system $[x^\mu] = (ct, r, \theta, \phi)$, where

$$T = t, \quad X = r\sin\theta\cos\phi, \quad Y = r\sin\theta\sin\phi, \quad Z = r\cos\theta.$$

In this case, the line element becomes

$$ds^2 = c^2\, dt^2 - dr^2 - r^2\, d\theta^2 - r^2 \sin^2\theta\, d\phi^2,$$

so the metric is $[g_{\mu\nu}] = \mathrm{diag}(1, -1, -r^2, -r^2 \sin^2\theta)$. From the metric we can
show that the non-vanishing components of the connection in this coordinate
system are (with $c = 1$)

$$\Gamma^1{}_{22} = -r, \qquad \Gamma^1{}_{33} = -r\sin^2\theta,$$

$$\Gamma^2{}_{12} = 1/r, \qquad \Gamma^2{}_{33} = -\sin\theta\cos\theta,$$

$$\Gamma^3{}_{13} = 1/r, \qquad \Gamma^3{}_{23} = \cot\theta.$$

Thus, from (5.32), the geodesic equations for the worldline $x^\mu(\tau)$ of a free particle
are very complicated in these coordinates (exercise), in spite of the fact that, to
an observer with fixed $(r, \theta, \phi)$ coordinates (i.e. at rest in S), a free particle still
moves in a straight line with fixed speed.
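
The connection coefficients for spherical polar coordinates can be verified numerically from the metric. The sketch below exploits the fact that the metric is diagonal, for which the general formula reduces to $\Gamma^a{}_{bc} = \tfrac{1}{2} g^{aa} (\partial_b g_{ac} + \partial_c g_{ab} - \partial_a g_{bc})$ with no sum over $a$. The derivatives are taken by central differences, and the function names, the step `h` and the sample point are illustrative assumptions.

```python
import math


def metric_diag(x):
    """Diagonal entries of [g_mu_nu] at x = (x0, r, theta, phi), with x0 = ct."""
    _, r, theta, _ = x
    return [1.0, -1.0, -r * r, -(r * math.sin(theta)) ** 2]


def d_metric(x, k, h=1e-6):
    """Central-difference derivative of the diagonal entries with respect to x^k."""
    plus, minus = list(x), list(x)
    plus[k] += h
    minus[k] -= h
    g_plus, g_minus = metric_diag(plus), metric_diag(minus)
    return [(g_plus[i] - g_minus[i]) / (2.0 * h) for i in range(4)]


def christoffel(x):
    """Connection coefficients G[a][b][c] = Gamma^a_{bc} for the diagonal metric."""
    g = metric_diag(x)
    dg = [d_metric(x, k) for k in range(4)]   # dg[k][i] = d g_ii / d x^k
    G = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    for a in range(4):
        for b in range(4):
            for c in range(4):
                total = 0.0
                if a == c:
                    total += dg[b][a]   # d_b g_ac is non-zero only when a == c
                if a == b:
                    total += dg[c][a]   # d_c g_ab is non-zero only when a == b
                if b == c:
                    total -= dg[a][b]   # d_a g_bc is non-zero only when b == c
                G[a][b][c] = 0.5 * total / g[a]
    return G


r, theta = 2.0, 0.7                      # sample point
G = christoffel((0.0, r, theta, 0.3))
print(G[1][2][2], -r)
print(G[1][3][3], -r * math.sin(theta) ** 2)
print(G[2][1][2], 1.0 / r)
print(G[2][3][3], -math.sin(theta) * math.cos(theta))
print(G[3][1][3], 1.0 / r)
print(G[3][2][3], 1.0 / math.tan(theta))
```

Each printed pair compares a numerical coefficient with the corresponding closed-form value listed above; all other coefficients, apart from those related to these by symmetry in the lower indices, should come out as zero to rounding error.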

Alternatively, we could use three-dimensional Cartesian coordinates to label 
points in a non-inertial frame S' that is accelerating with respect to S. As an 
example, consider transforming from [ A M ] = (cT, X, Y, Z ) to a new system of 
coordinates [x M ] = (ct, x, y, z), where t, x, y, z arc defined by the equations 6 

T — t, X = xcos cot — y sin cot, Y — xsinwt + ycoswt, Z = z. 

Thus points with constant x, y, z values (i.e. the values arc fixed in S') rotate 
with angular speed to about the Z-axis of S (see Figure 5.6). Substituting these 
definitions into (5.31), the line element becomes 

ds 2 = [c 2 — or (x 2 + y 2 )]dt 2 + 2 wydtdx — Icoxdtdy — dx 2 — dy 2 — dz 2 , 

and the geodesic equations (5.32) arc (exercise) 

if = 0, 

x — co 2 xt 2 — 2 coyt = 0, 
y — co 2 yt 2 + 2a)xt = 0, 
z — 0, 


For a full discussion, see for example J. Foster & J. D. Nightingale, A Short Course in General Relativity, 
Springer-Verlag, 1995. 
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Figure 5.6 The coordinate system (x, y, z ) rotating relative to the inertial coor¬ 
dinate system ( X , Y, Z ). 


where the dots denote differentiation with respect to proper time r. These equa¬ 
tions give the worldline x 11 ( t) of a free particle in this coordinate system. Once 
again, the first equation implies that dt/dT = constant, so that we can replace 
the dots in the remaining three equations with derivatives with respect to t. 
Multiplying through by the rest mass m of the particle and rearranging, these 
equations become 





2 . dy 

mco~x + 2mco —, 
dt 

2 dx 

ma) y — 2mto —, 
dt 


0 , 


or, in 3-vector notation. 


x rl x 

m—— = —md) x (d> x x) — 2mci) x —, 
dt 2 ’ dt 


(5.34) 


where $\vec{x} = (x, y, z)$ and $\vec{\omega} = (0, 0, \omega)$. Thus we recover the equation of motion
for a free particle in a rotating frame of reference. We note, however, that the
coordinate $t$ is the time measured by clocks at rest in the non-rotating system $S$,
since we have set $t = T$. It is possible to rewrite the equation of motion in terms of
the proper time measured by an observer at some fixed position in $S'$, but to
do so would involve replacing (5.34) by a more complicated equation that tends to
conceal the Coriolis and centrifugal forces. Note that $t$ is exactly the proper time
for an observer situated at the common origin $O$ of the two systems, so observers
close to $O$ who are at rest in $S'$ would accept (5.34) as (approximately) valid.
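
As an illustrative numerical check of (5.34), the following minimal Python sketch takes a particle moving uniformly in the inertial frame, maps it into the rotating coordinates, and evaluates the $x$ and $y$ components of (5.34) by finite differences. It assumes the usual rotation relation $X = x\cos\omega t - y\sin\omega t$, $Y = x\sin\omega t + y\cos\omega t$ between the two systems; the value of `OMEGA`, the initial data and the function names are arbitrary choices made only for this example.

```python
import math

OMEGA = 0.7  # angular speed of the rotating system (illustrative value)


def rotating_coords(t, X0=(1.0, -2.0), V=(0.3, 0.5)):
    """Return (x, y) in the rotating system for uniform motion X0 + V*t in S."""
    X = X0[0] + V[0] * t
    Y = X0[1] + V[1] * t
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    # Inverse of the assumed relation X = x cos(wt) - y sin(wt), Y = x sin(wt) + y cos(wt)
    return (X * c + Y * s, -X * s + Y * c)


def residuals(t, h=1e-4):
    """Finite-difference residuals of the x and y components of (5.34), divided by m."""
    xm, ym = rotating_coords(t - h)
    x0, y0 = rotating_coords(t)
    xp, yp = rotating_coords(t + h)
    vx, vy = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    ax, ay = (xp - 2 * x0 + xm) / h**2, (yp - 2 * y0 + ym) / h**2
    # Centrifugal term omega^2 x and Coriolis terms +/- 2 omega v
    rx = ax - OMEGA**2 * x0 - 2 * OMEGA * vy
    ry = ay - OMEGA**2 * y0 + 2 * OMEGA * vx
    return rx, ry


max_residual = max(abs(r) for t in (0.0, 1.0, 2.5) for r in residuals(t))
print(max_residual)
```

The residual is limited only by the finite-difference step, which suggests that the centrifugal and Coriolis terms are exactly what straight-line inertial motion looks like in the rotating coordinates.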

From these examples, we see that in general the geodesic equations can be 
rather complicated both for non-inertial frames and for inertial frames labelled 
by non-Cartesian spatial coordinates. Thus, when describing physical effects in 
an inertial frame, it is conventional to use Cartesian spatial coordinates to label 
points in the frame and so to work in a coordinate system $X^\mu$ for which (5.31) is
valid. It is then much easier to disentangle the physical effects from artefacts of 
the coordinate system. 


Exercises 


5.1 Show that the transformation matrix for a Lorentz transformation from S to S' in 
standard configuration is given by (5.4). 

5.2 Show that, under a Lorentz transformation, the covariant components of a vector
transform as $v'_\mu = \Lambda_\mu{}^\nu v_\nu$. Hence show explicitly in component form that, for two
4-vectors $\boldsymbol{v}$ and $\boldsymbol{w}$, the scalar product $\boldsymbol{v}\cdot\boldsymbol{w}$ is invariant under a Lorentz transformation.

5.3 Prove that, for any timelike vector v in Minkowski space, there exists an inertial 
frame in which the spatial components are zero. 

5.4 Prove (a) that the sum of any two spacelike vectors is spacelike; and (b) that a 
timelike vector and a null vector cannot be orthogonal. 

5.5 For the spaceship discussed in Section 1.14, which maintains a uniform acceleration
$a$ in the $x$-direction of some inertial frame $S$, the worldline is given by

$$t(\tau) = \frac{c}{a}\sinh\frac{a\tau}{c}, \quad x(\tau) = \frac{c^2}{a}\left(\cosh\frac{a\tau}{c} - 1\right), \quad y(\tau) = 0, \quad z(\tau) = 0,$$

where $\tau$ is the proper time of an astronaut on the spaceship. Show that the 4-velocity
of the rocket in the coordinate system $(ct, x, y, z)$ is given by

$$[u^\mu] = \left(c\cosh\frac{a\tau}{c},\ c\sinh\frac{a\tau}{c},\ 0,\ 0\right).$$

Hence show explicitly that $u^\mu u_\mu = c^2$ and that the spaceship's 3-velocity is

$$\vec{u} = \left(c\tanh\frac{a\tau}{c},\ 0,\ 0\right).$$


5.6 Show that the 4-acceleration of the spaceship in Exercise 5.5 is given by

$$[a^\mu] = \left(a\sinh\frac{a\tau}{c},\ a\cosh\frac{a\tau}{c},\ 0,\ 0\right).$$

Hence show that $a^\mu a_\mu = -a^2$ and that the magnitude of the spaceship's 3-acceleration
in its own instantaneous rest frame is also $a$.

5.7 A spaceship has constant acceleration $g$ in the $x$-direction in its locally comoving
frame, i.e. the IRF. Show that, in an inertial frame, the spaceship's 4-velocity $[u^\mu] =
(u^0, u^1, 0, 0)$ and 4-acceleration $[a^\mu] = (a^0, a^1, 0, 0)$ satisfy $a^1 = gu^0/c$ and $a^0 =
gu^1/c$. Show also that

$$\frac{d^2u^1}{d\tau^2} = \frac{g^2}{c^2}u^1,$$

where $\tau$ is the proper time as measured by an occupant of the spaceship. A spaceship
accelerates at a constant rate $g = 9.5\ \mathrm{m\,s^{-2}}$ in its own locally comoving frame.
It starts out towards the centre of the Galaxy 10 kpc distant. After going 5 kpc
it decelerates at the same rate to come to rest again at the Galactic centre. The
outward journey is then repeated in reverse to come back home. Show that, in the
spaceship's frame, the elapsed travel time is 41.5 years. What is the elapsed time
for the waiting observer (or descendants) on Earth?

5.8 Show that in its own instantaneous rest frame (IRF), a particle's 4-acceleration is
given by $[a^\mu] = (0, \vec{a})$, where $\vec{a}$ is the 3-acceleration of the particle in the IRF.

5.9 Show that, in an inertial frame in which a particle's 3-acceleration $\vec{a}$ is orthogonal
to its 3-velocity $\vec{u}$, the particle's 4-acceleration is given by $[a^\mu] = \gamma_u^2(0, \vec{a})$.

5.10 Show that when an electron and a positron annihilate, more than one photon must 
be produced. 

5.11 Show that if a photon is reflected from a mirror moving parallel to its plane, then 
the angle of incidence of the photon is equal to the angle of reflection. 

5.12 An inertial frame $S'$ moves with constant velocity $u$ along the $x$-axis with respect
to frame $S$. A photon in frame $S'$ is fired at an angle $\theta'$ to the forward direction of
motion. Show that the angle $\theta$ measured in frame $S$ is given by

$$\tan\theta = \frac{\tan\theta'\,(1 - \beta^2)^{1/2}}{1 + \beta\sec\theta'},$$

where $\beta = u/c$.

5.13 A photon with energy $E$ collides with a stationary electron whose rest mass is $m_0$.
As a result of the collision the direction of the photon's motion is deflected through
an angle $\theta$ and its energy is reduced to $E'$. Show that

$$m_0c^2\left(\frac{1}{E'} - \frac{1}{E}\right) = 1 - \cos\theta.$$

Deduce that the wavelength of the photon is increased by

$$\Delta\lambda = \frac{h}{m_0c}(1 - \cos\theta),$$

where $h$ is Planck's constant. At what angle to the initial photon direction does
the electron move? Show that, if the photon is deflected through a right angle, and
the photon energy satisfies $E \ll m_0c^2$, then after the interaction the angle of the
electron's motion to the direction of the photon's initial motion is $\alpha = -\pi/4$.

5.14 Inverse Compton scattering occurs whenever a photon scatters off a particle moving
with a speed very nearly equal to that of light. Suppose that a particle of rest mass
$m_0$ and total energy $E$ collides head on with a photon of energy $E_\gamma$. Show that the
scattered photon has energy

$$E\left(1 + \frac{m_0^2c^4}{4EE_\gamma}\right)^{-1}.$$

Ultra-high-energy cosmic rays have energies up to $10^{20}$ eV. How much energy can
a cosmic ray proton transfer to a microwave background photon?

5.15 For a pure 4-force $\boldsymbol{f}$ acting on a particle of rest mass $m_0$, show that the corresponding
3-force $\vec{f}$ satisfies

$$\vec{f} = \gamma_u m_0\,\vec{a} + \frac{\vec{f}\cdot\vec{u}}{c^2}\,\vec{u}.$$

Hence show that $\vec{a}$ is only parallel to $\vec{f}$ when $\vec{f}$ is either parallel or orthogonal
to $\vec{u}$. Show further that, in these two cases, one has $\vec{f} = \gamma_u^3 m_0\,\vec{a}$ and $\vec{f} = \gamma_u m_0\,\vec{a}$
respectively.

5.16 For a pure 4-force $\boldsymbol{f}$ acting on a particle of rest mass $m_0$, show that


du 

"'o , = J u f- 

dr 

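5.17 In Minkowski spacetime, consider an emitter $\mathcal{E}$ moving at speed $v$ along the positive
$x^1$-axis of the frame $S$ in which a receiver $\mathcal{R}$ is at rest. Prove the Doppler shift
formula

$$\frac{\nu_{\mathcal{E}}}{\nu_{\mathcal{R}}} = \gamma_v(1 - \beta\cos\theta),$$

where $\theta$ is the angle made by the photon trajectory with the $x^1$-axis of $S$. Show that
this expression can be written in the manifestly covariant way

$$\frac{\nu_{\mathcal{E}}}{\nu_{\mathcal{R}}} = \frac{\boldsymbol{k}\cdot\boldsymbol{u}_{\mathcal{E}}}{\boldsymbol{k}\cdot\boldsymbol{u}_{\mathcal{R}}},$$

where $\boldsymbol{k}$ is the photon 4-wavevector and $\boldsymbol{u}_{\mathcal{E}}$ and $\boldsymbol{u}_{\mathcal{R}}$ are the 4-velocities of $\mathcal{E}$ and $\mathcal{R}$
respectively.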

5.18 An astronaut on the space rocket in Exercise 5.5 refers all his measurements to an
orthonormal tetrad $\{\hat{\boldsymbol{e}}_\mu(\tau)\}$ that comprises the basis vectors of a Cartesian instantaneous
rest frame $S'$ at proper time $\tau$. Suppose that at $\tau = 0$ the tetrad coincides
with the fixed basis vectors $\{\boldsymbol{e}_\mu\}$ of the $(ct, x, y, z)$ coordinate system in the inertial
frame $S$ and that the rocket is not rotating in any way. Show that, in the $(ct, x, y, z)$
coordinate system, the components of the astronaut's orthonormal tetrad at some
later proper time $\tau$ are

$$\hat{\boldsymbol{e}}_0(\tau) = \left(\cosh\frac{a\tau}{c},\ \sinh\frac{a\tau}{c},\ 0,\ 0\right),$$
$$\hat{\boldsymbol{e}}_1(\tau) = \left(\sinh\frac{a\tau}{c},\ \cosh\frac{a\tau}{c},\ 0,\ 0\right),$$
$$\hat{\boldsymbol{e}}_2(\tau) = (0, 0, 1, 0),$$
$$\hat{\boldsymbol{e}}_3(\tau) = (0, 0, 0, 1).$$

The astronaut observes photons that were emitted with frequency $\nu_0$ from a star that
is stationary at the origin of $S$. Show that the frequency of the photons as measured
by the astronaut at proper time $\tau$ is given by

$$\nu(\tau) = \nu_0\exp(-a\tau/c).$$


5.19 At some event $P$ in Minkowski spacetime, the worldline of a particle (either massive
or massless) and an observer cross. If, at this event, the particle has 4-momentum
$\boldsymbol{p}$ and the observer has 4-velocity $\boldsymbol{u}$ then show that the observer measures the
magnitude of the spatial momentum of the particle to be

$$|\vec{p}| = \left[\frac{(\boldsymbol{p}\cdot\boldsymbol{u})^2}{c^2} - \boldsymbol{p}\cdot\boldsymbol{p}\right]^{1/2}.$$


5.20 Repeat Exercise 1.10 using 4-vectors. 

5.21 In Minkowski spacetime, the coordinates $(cT, X, Y, Z)$ correspond to a Cartesian
inertial frame. The coordinates $(ct, r, \theta, \phi)$ are related to them by the equations

$$X = r\sin\theta\cos\phi, \quad Y = r\sin\theta\sin\phi, \quad Z = r\cos\theta.$$

Obtain the special-relativistic equations of motion of a free particle in the $(ct, r, \theta, \phi)$
coordinate system, and interpret these equations physically.

5.22 Repeat Exercise 5.21 for the coordinates $(ct, \rho, \phi, z)$, that are related to the Cartesian
inertial coordinates $(cT, X, Y, Z)$ by

$$T = t,$$
$$X = \rho\cos\phi\cos\omega t - \rho\sin\phi\sin\omega t,$$
$$Y = \rho\cos\phi\sin\omega t + \rho\sin\phi\cos\omega t,$$
$$Z = z,$$

where $\omega$ is a constant.


Electromagnetism


At the time special relativity was devised only two forces were known, electromagnetism
and gravity. As mentioned in Chapter 1, it was electromagnetism that
actually led to the development of special relativity. Therefore, we now discuss
electromagnetism in some detail; in particular its relativistic formulation. This
will introduce a number of ideas that we will use later in developing and applying
a relativistic formulation of gravity, namely general relativity. Our guiding principle
here is to derive tensorial equations in Minkowski spacetime. This makes
it possible to express the theory in a form that is independent of the coordinate
system used. We will see that a consistent theory of electromagnetism follows
from saying that there exists a pure 4-force that depends linearly on 4-velocity
and also on a certain property of a particle, namely its charge $q$. Even if one has
no prior knowledge of electromagnetism, one can derive the complete theory in
a few lines using this basic assumption and occasional appeals to simplicity.


6.1 The electromagnetic force on a moving charge 

In some inertial frame $S$, the 3-force on a particle of charge $q$ moving in an
electromagnetic field is

$$\vec{f} = q(\vec{E} + \vec{u}\times\vec{B}),$$

where $\vec{u}$ is the particle's 3-velocity in $S$. The 3-vector fields $\vec{E}$ and $\vec{B}$ are the
electric and magnetic fields as measured in $S$. This equation suggests that for the
proper relativistic formulation we should write down a tensor equation in four-dimensional
spacetime in which the electromagnetic 4-force $\boldsymbol{f}$ depends linearly
on the particle's 4-velocity $\boldsymbol{u}$. Thus we are led to an equation of the form

$$\boldsymbol{f} = q\,\boldsymbol{F}\cdot\boldsymbol{u}, \tag{6.1}$$

where $\boldsymbol{F}$ must be a rank-2 tensor in order to make a 4-force from a 4-velocity.
We call $\boldsymbol{F}$ the electromagnetic field tensor. The scalar $q$ is some property of the
particle that determines the strength of the electromagnetic force upon it (i.e. its
charge).

We could develop the theory entirely in terms of coordinate-independent
4-vectors and 4-tensors. Nevertheless, if we label points in spacetime with some
arbitrary coordinate system $x^\mu$, we may express (6.1) in component form as

$$f_\mu = qF_{\mu\nu}u^\nu,$$

where the $F_{\mu\nu}$ are the covariant components of $\boldsymbol{F}$ in our chosen coordinate
system. In order that the rest mass of a particle is not altered by the action of
the electromagnetic force we require the latter to be a pure force, so that for any
4-velocity $\boldsymbol{u}$ we have $\boldsymbol{u}\cdot\boldsymbol{f} = 0$. In component form this reads

$$u^\mu f_\mu = qF_{\mu\nu}u^\mu u^\nu = 0,$$

which implies that the electromagnetic field tensor must be antisymmetric, i.e.

$$F_{\mu\nu} = -F_{\nu\mu}.$$

The contravariant components of $\boldsymbol{F}$ are given by

$$F^{\mu\nu} = g^{\mu\alpha}g^{\nu\beta}F_{\alpha\beta},$$

where the $g^{\mu\nu}$ are the contravariant components of the metric tensor in our
coordinate system. Since $g^{\mu\nu}$ is symmetric, it is clear that $F^{\mu\nu} = -F^{\nu\mu}$ also.
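
The link between the pure-force condition and antisymmetry can be illustrated numerically. The short Python sketch below (an illustration only, with arbitrary random components rather than a physical field) splits a generic $4\times4$ array into symmetric and antisymmetric parts and evaluates the contraction $F_{\mu\nu}u^\mu u^\nu$ for each. The helper names `contract` and `split` are hypothetical, and the components `u` are arbitrary numbers, not a normalised 4-velocity.

```python
import random


def contract(F, u):
    """Evaluate F_{mu nu} u^mu u^nu for a 4x4 nested list F and a 4-component u."""
    return sum(F[m][n] * u[m] * u[n] for m in range(4) for n in range(4))


def split(M):
    """Return the symmetric and antisymmetric parts of a 4x4 array."""
    sym = [[0.5 * (M[m][n] + M[n][m]) for n in range(4)] for m in range(4)]
    anti = [[0.5 * (M[m][n] - M[n][m]) for n in range(4)] for m in range(4)]
    return sym, anti


random.seed(1)
M = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
u = [random.uniform(-1, 1) for _ in range(4)]

sym, anti = split(M)
anti_value = contract(anti, u)   # vanishes up to rounding
sym_value = contract(sym, u)     # carries the whole contraction
full_value = contract(M, u)
print(anti_value, sym_value, full_value)
```

Only the symmetric part contributes to the contraction, so demanding that it vanish for every admissible $u^\mu$ leaves an antisymmetric $F_{\mu\nu}$.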


6.2 The 4-current density 

So far we have found only the relativistic form of the electromagnetic force on 
an idealised point particle with charge q and 4-velocity u, in terms of some as 
yet undetermined rank-2 antisymmetric tensor F. In order to develop the theory 
further, we must now construct the field equations of the theory, which determine 
the electromagnetic field tensor F(x) at any point in spacetime in terms of charges 
and currents. To construct these field equations, we must first find a properly 
relativistic (or covariant) way of expressing the source term. In other words, we 
need to identify the 4-tensor, defined at each event in spacetime, that acts as the 
source of the electromagnetic field. 

Let us consider some general time-dependent charge distribution. At each
event $P$ in spacetime we can characterise the distribution completely by giving
the charge density $\rho$ and 3-velocity $\vec{u}$ as measured in some inertial frame. For
simplicity, let us consider the fluid in the frame $S$ in which $\vec{u} = 0$ at $P$. In this
frame, the (proper) charge density is given by $\rho_0 = qn_0$, where $q$ is the charge
on each particle and $n_0$ is the number of particles in a unit volume. In some
other frame $S'$, moving with speed $v$ relative to $S$, the volume containing a fixed
number of particles will be Lorentz contracted along the direction of motion (see
Figure 6.1). Hence in $S'$ the number density of particles is $n' = \gamma_v n_0$, from which
we obtain

$$\rho' = \gamma_v\rho_0.$$

Figure 6.1 The Lorentz contraction of a fluid element in the direction of motion.
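
The contraction argument can be followed numerically. The Python sketch below is a minimal illustration: it assumes a proper volume of 1 m³ and a made-up proper density, uses the invariance of total charge, and compares the result with the time component of the boosted array $(c\rho_0, 0, 0, 0)$ under a standard-configuration boost. The function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    """Lorentz factor for speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def charge_density_in_moving_frame(rho0, v):
    """Charge density in S' for proper density rho0, via volume contraction."""
    proper_volume = 1.0                            # m^3 in the rest frame S (assumed)
    charge = rho0 * proper_volume                  # total charge is frame independent
    contracted_volume = proper_volume / gamma(v)   # one dimension shrinks by 1/gamma
    return charge / contracted_volume


def boosted_time_component_over_c(rho0, v):
    """0-component of the boosted array (c*rho0, 0, 0, 0), divided by c."""
    # Standard-configuration boost: j'^0 = gamma * (j^0 - (v/c) * j^1), with j^1 = 0
    return gamma(v) * (C * rho0) / C


v = 0.6 * C
rho_prime = charge_density_in_moving_frame(2.0, v)
rho_from_boost = boosted_time_component_over_c(2.0, v)
print(rho_prime, rho_from_boost)
```

Both routes give the same density, which is the behaviour expected of the 0-component of a 4-vector rather than of a scalar.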


Thus we see that the charge density is not a 4-scalar but does transform as the
0-component of a 4-vector. This suggests that the source term in the electromagnetic
field equations should be a 4-vector. At each point in spacetime, the obvious
choice is

$$\boldsymbol{j}(x) = \rho_0(x)\,\boldsymbol{u}(x),$$

where $\rho_0(x)$ is the proper charge density of the fluid (i.e. that measured by an
observer comoving with the local flow) and $\boldsymbol{u}(x)$ is its 4-velocity. The squared
length of this 4-current density $\boldsymbol{j}$ at any event is

$$\boldsymbol{j}\cdot\boldsymbol{j} = \rho_0^2c^2.$$

In an inertial frame $S$ the components of the 4-current density $\boldsymbol{j}$ are

$$[j^\mu] = \rho_0\gamma_u(c, \vec{u}) = (c\rho, \vec{j}),$$

where $\rho$ is the charge density as measured in $S$ and $\vec{j}$ is the relativistic 3-current
density in $S$. Thus, we see that $c^2\rho^2 - j^2$ is a Lorentz invariant, where $j^2 = \vec{j}\cdot\vec{j}$.


6.3 The electromagnetic field equations


We are now in a position to write down the electromagnetic field equations. The
simplest way in which to relate the rank-2 electromagnetic field tensor $\boldsymbol{F}$ to the
4-vector $\boldsymbol{j}$ is to contract $\boldsymbol{F}$ with some other 4-vector. Since there are no more
physical 4-vectors associated with the theory, the only other 4-vector that the field
equations can contain is the 4-gradient $\nabla$. Thus the field equations must be of the
form

$$\nabla\cdot\boldsymbol{F} = k\,\boldsymbol{j}, \tag{6.2}$$

where $k$ is an unimportant constant related to our choice of units. In order to make
our final results more familiar, let us work in Cartesian inertial coordinates $x^\mu$
corresponding to some inertial frame $S$. In such a system, the covariant derivative
reduces simply to the partial derivative, and so we can write (6.2) in component
form as

$$\partial_\mu F^{\mu\nu} = kj^\nu. \tag{6.3}$$


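We can use this field equation to obtain the law for the conservation of charge.
If we take the partial derivative $\partial_\nu$ of (6.3), we obtain

$$\partial_\nu\partial_\mu F^{\mu\nu} = k\,\partial_\nu j^\nu. \tag{6.4}$$

However, since $F^{\mu\nu}$ is antisymmetric, we can write the scalar on the left-hand
side as

$$\partial_\nu\partial_\mu F^{\mu\nu} = -\partial_\nu\partial_\mu F^{\nu\mu} = -\partial_\mu\partial_\nu F^{\nu\mu} = -\partial_\nu\partial_\mu F^{\mu\nu},$$

where the first equality uses the antisymmetry of $F^{\mu\nu}$, the second the fact that partial
derivatives commute, and the third a relabelling of the dummy indices. From this
we deduce that $\partial_\nu\partial_\mu F^{\mu\nu} = 0$. Thus the right-hand side of (6.4) must
also be zero, so that

$$\partial_\mu j^\mu = 0.$$

Using 3-vector notation in the frame $S$, we may write this in a more familiar way:

$$\frac{\partial\rho}{\partial t} + \vec{\nabla}\cdot\vec{j} = 0,$$

which expresses the conservation of charge. This equation has the same form as
the non-relativistic equation of charge continuity, but the relativistic expressions
for $\rho$ and $\vec{j}$ must be used in it.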

It is clear, however, that we do not yet have a viable theory. The field equations
of the theory are given by (6.3), but there are six independent components in
$F^{\mu\nu}$ and only four field equations. Evidently our theory is under-determined as it
stands. This suggests that $\boldsymbol{F}$ could be constructed from a 4-vector 'potential' $\boldsymbol{A}$.
Again working in Cartesian inertial coordinates $x^\mu$, let us write

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{6.5}$$

Thus $F_{\mu\nu}$ is antisymmetric by construction and contains only four independent
fields $A_\mu$. Using the field equation (6.3), we can write

$$kj_\lambda = \eta_{\lambda\nu}\partial_\mu F^{\mu\nu} = \partial_\mu F^\mu{}_\lambda = \eta^{\mu\sigma}\partial_\mu F_{\sigma\lambda},$$

where we have used the fact that the metric coefficients in Cartesian inertial
coordinates $x^\mu$ are constants.¹ Hence, by substituting into the expression (6.5),
we obtain the electromagnetic field equations in terms of the 4-vector potential $\boldsymbol{A}$
as

$$\eta^{\mu\sigma}(\partial_\mu\partial_\sigma A_\lambda - \partial_\mu\partial_\lambda A_\sigma) = kj_\lambda. \tag{6.6}$$

Alternatively, we can express electromagnetism entirely in terms of the electromagnetic
field tensor $F_{\mu\nu}$. In this case, we require the two field equations

$$\partial_\mu F^{\mu\nu} = kj^\nu,$$
$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0, \tag{6.7}$$

where the second of these is straightforwardly derived from (6.5). Using the
antisymmetrisation operation described in Section 4.3, the second equation can
also be written very succinctly as $\partial_{[\sigma}F_{\mu\nu]} = 0$. The constant $k$ may be found by
demanding consistency with the standard Maxwell equations (see Section 6.5). In
SI units we have $k = \mu_0$, where $\epsilon_0\mu_0 = 1/c^2$.


6.4 Electromagnetism in the Lorenz gauge 

Suppose that we add an arbitrary 4-vector $\boldsymbol{Q}$ to the 4-potential $\boldsymbol{A}$. Thus, in
component form (in Cartesian inertial coordinates $x^\mu$, for example) we have

$$A_\mu^{(\mathrm{new})} = A_\mu + Q_\mu. \tag{6.8}$$

Note that this is not a coordinate transformation. We are still working in the same
set of coordinates $x^\mu$ but have defined a new vector $\boldsymbol{A}^{(\mathrm{new})}$, whose components
in this basis are given by (6.8). The new electromagnetic field tensor is then
given by

$$F_{\mu\nu}^{(\mathrm{new})} = \partial_\mu A_\nu^{(\mathrm{new})} - \partial_\nu A_\mu^{(\mathrm{new})} = \partial_\mu A_\nu - \partial_\nu A_\mu + \partial_\mu Q_\nu - \partial_\nu Q_\mu.$$

¹ In fact, such an operation is valid in any coordinate system. As we showed in Chapter 4, the covariant
derivative of the metric tensor is identically zero, which means that we can interchange the order of index
raising or lowering and covariant differentiation without affecting the result.

Clearly, we will recover the original electromagnetic field tensor provided that

$$\partial_\mu Q_\nu = \partial_\nu Q_\mu.$$

This equation can be satisfied if $\boldsymbol{Q}$ is the gradient of some scalar field $\psi$ (say), so
that $Q_\mu = \partial_\mu\psi$. Thus we have uncovered a gauge freedom in the theory: we are
free to add the gradient of any scalar field $\psi$ to the 4-vector potential $\boldsymbol{A}$, giving

$$A_\mu^{(\mathrm{new})} = A_\mu + \partial_\mu\psi, \tag{6.9}$$

and still recover the same electromagnetic field tensor and hence the same electromagnetic
field equations. The transformation (6.9) is an example of a gauge
transformation and, as stated above, is distinct from a coordinate transformation.
In the field equations

$$\eta^{\mu\sigma}(\partial_\mu\partial_\sigma A_\lambda - \partial_\mu\partial_\lambda A_\sigma) = \mu_0 j_\lambda,$$

the second term on the left-hand side can be written as $\partial_\lambda\partial_\sigma A^\sigma$. Thus, we can
make this term zero by choosing a scalar field $\psi$ such that

$$\partial_\sigma A^\sigma = 0. \tag{6.10}$$

This condition is called the Lorenz gauge. It is worth noting that the condition
(6.10) is preserved by any further gauge transformation $A_\mu \to A_\mu + \partial_\mu\psi$ if and
only if $\partial_\mu\partial^\mu\psi = 0$.

Adopting the Lorenz gauge allows the electromagnetic field equations to be
written very simply as

$$\eta^{\mu\sigma}\partial_\mu\partial_\sigma A_\lambda = \partial_\mu\partial^\mu A_\lambda = \mu_0 j_\lambda.$$

It is usual to write the four-dimensional Laplacian $\partial_\mu\partial^\mu$ using the notation $\Box^2 =
\partial_\mu\partial^\mu = \partial^\mu\partial_\mu$, where $\Box^2$ is the d'Alembertian operator.² In Cartesian inertial
coordinates $(ct, x, y, z)$,

$$\Box^2 = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2}.$$

Then the electromagnetic field equations in the Lorenz gauge take the especially
simple form

$$\Box^2 A_\mu = \mu_0 j_\mu,$$

together with the attendant gauge condition (6.10). Moreover, in the absence of
charges and currents, the right-hand side becomes zero and so $A_\mu$ has wave
solutions travelling at the speed of light, as do the components of $F_{\mu\nu}$ since in
this case we also have $\Box^2 F_{\mu\nu} = 0$.

² This operator should properly be written $\nabla^2$, which is the inner product $\nabla\cdot\nabla$ of the 4-gradient with itself.
However, the notation we have adopted is quite common, since it makes clearer the distinction between the
four-dimensional Laplacian and the three-dimensional Laplacian $\vec{\nabla}^2 = \vec{\nabla}\cdot\vec{\nabla}$.


6.5 Electric and magnetic fields in inertial frames 

We have not yet identified the components of $\boldsymbol{F}$ (or $\boldsymbol{A}$) with the familiar electric
and magnetic 3-vector fields $\vec{E}$ and $\vec{B}$ as observed in some Cartesian inertial frame
$S$. This is simply a matter of convention; we just have to name the components of
$\boldsymbol{A}$ (say) in a way which results in 3-vector equations in $S$ that describe the physics
correctly in terms of the traditionally defined 3-vectors $\vec{E}$ and $\vec{B}$. Thus, in some
Cartesian inertial frame $S$, the components of $\boldsymbol{A}$ are taken to be as follows:

$$[A^\mu] = \left(\frac{\phi}{c}, \vec{A}\right),$$

where $\phi$ is the electrostatic potential and $\vec{A}$ is the traditional three-dimensional
vector potential. In terms of $\phi$ and $\vec{A}$, the Lorenz gauge condition becomes

$$\vec{\nabla}\cdot\vec{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0,$$

and, in this gauge, the field equations take the form

$$\Box^2\vec{A} = \mu_0\vec{j} \quad\text{and}\quad \Box^2\phi = \frac{\rho}{\epsilon_0}.$$

In terms of $\phi$ and $\vec{A}$, the electric and magnetic fields in $S$ are given by

$$\vec{B} = \vec{\nabla}\times\vec{A} \quad\text{and}\quad \vec{E} = -\vec{\nabla}\phi - \frac{\partial\vec{A}}{\partial t}. \tag{6.11}$$

It is straightforward to show that these equations lead to the Maxwell equations
in their familiar form,

$$\vec{\nabla}\cdot\vec{E} = \frac{\rho}{\epsilon_0}, \qquad \vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t},$$
$$\vec{\nabla}\cdot\vec{B} = 0, \qquad \vec{\nabla}\times\vec{B} = \mu_0\vec{j} + \mu_0\epsilon_0\frac{\partial\vec{E}}{\partial t}.$$

From the expressions (6.11) and (6.5) we have

$$E^i = -\delta^{ij}\partial_j\phi - c\,\partial_0 A^i = -c\,\delta^{ij}(\partial_j A_0 - \partial_0 A_j) = -c\,\delta^{ij}F_{j0},$$

where we have used the fact that $A^0 = \eta^{0\nu}A_\nu = A_0$. Also, we have

$$B^1 = \partial_2 A^3 - \partial_3 A^2 = \partial_3 A_2 - \partial_2 A_3 = F_{32},$$

where we have used the fact that $A^i = \eta^{i\nu}A_\nu = -A_i$. Similar results hold for $B^2$
and $B^3$. Thus we find that the covariant components of $\boldsymbol{F}$ in $S$ are given by

$$[F_{\mu\nu}] = \begin{pmatrix} 0 & E^1/c & E^2/c & E^3/c \\ -E^1/c & 0 & -B^3 & B^2 \\ -E^2/c & B^3 & 0 & -B^1 \\ -E^3/c & -B^2 & B^1 & 0 \end{pmatrix}.$$

The corresponding electric and magnetic fields $\vec{E}'$ and $\vec{B}'$ in some other Cartesian
inertial frame $S'$ are most easily obtained by calculating the components of the
electromagnetic field tensor $\boldsymbol{F}$ or the 4-potential $\boldsymbol{A}$ in this frame. For example, if $S'$
is moving at speed $v$ relative to $S$ in standard configuration then the components
in $S'$ are given by

$$A'^\mu = \Lambda^\mu{}_\nu A^\nu \quad\text{and}\quad F'^{\mu\nu} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta F^{\alpha\beta},$$

where the matrix $[\Lambda^\mu{}_\nu]$ is given in Chapter 5.
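
The transformation rule can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: it works in units with $c = 1$ by default, builds the contravariant components $F^{\mu\nu}$ (obtained from the covariant matrix above by raising both indices with $\eta^{\mu\nu}$ of signature $(+,-,-,-)$), applies a standard-configuration boost, and reads the transformed fields back off. The function names and the sample field values are hypothetical.

```python
import math


def field_tensor(E, B, c=1.0):
    """Contravariant components F^{mu nu} built from the 3-vectors E and B."""
    Ex, Ey, Ez = (e / c for e in E)
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]


def boost_matrix(v, c=1.0):
    """Lorentz transformation matrix for S' moving at speed v along x (standard configuration)."""
    beta = v / c
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [g, -g * beta, 0.0, 0.0],
        [-g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(F, L):
    """F'^{mu nu} = L^mu_alpha L^nu_beta F^{alpha beta}."""
    return [[sum(L[m][a] * L[n][b] * F[a][b] for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]


def fields_from_tensor(F, c=1.0):
    """Read E and B back off the contravariant components."""
    E = (-c * F[0][1], -c * F[0][2], -c * F[0][3])
    B = (-F[2][3], F[1][3], -F[1][2])
    return E, B


E, B = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
E_new, B_new = fields_from_tensor(transform(field_tensor(E, B), boost_matrix(0.6)))
print(E_new, B_new)
```

With these sample values the component of $\vec{E}$ along the boost is unchanged while the transverse components mix with $\vec{B}$, and the combinations $\vec{E}\cdot\vec{B}$ and $c^2B^2 - E^2$ come out the same in both frames.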


6.6 Electromagnetism in arbitrary coordinates 

So far we have developed electromagnetic theory in Cartesian inertial coordinates.
In general, however, we are free to label points in the Minkowski spacetime using
any arbitrary coordinate system $x^\mu$. We could have developed the entire theory
in such an arbitrary system, or even in a coordinate-independent way by using
the 4-tensors themselves rather than their components in some coordinate system.
Nevertheless, having expressed the theory in Cartesian inertial coordinates, it is
now trivial to re-express it in a form valid in arbitrary coordinates.

As shown in (6.7), the electromagnetic field equations in Cartesian inertial
coordinates, when expressed in terms of $\boldsymbol{F}$, are given by

$$\partial_\mu F^{\mu\nu} = \mu_0 j^\nu,$$
$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0.$$

In such a coordinate system, the partial derivative $\partial_\mu$ is identical to the covariant
derivative $\nabla_\mu$, so we can rewrite these equations as

$$\nabla_\mu F^{\mu\nu} = \mu_0 j^\nu,$$
$$\nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} = 0. \tag{6.12}$$

These new equations are now fully covariant tensor equations, however, so that if
they are valid in one system of coordinates then they are valid in all coordinate
systems. Thus, (6.12) gives the electromagnetic field equations in an arbitrary
coordinate system! Once again, using the antisymmetrisation operation discussed
in Section 4.3, one can write the second equation simply as $\nabla_{[\sigma}F_{\mu\nu]} = 0$.

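A similar procedure can be performed for the electromagnetic field equations
when expressed in terms of the 4-vector potential $\boldsymbol{A}$. From (6.6), in Cartesian
inertial coordinates we have

$$\eta^{\mu\sigma}(\partial_\mu\partial_\sigma A_\lambda - \partial_\mu\partial_\lambda A_\sigma) = \mu_0 j_\lambda.$$

Once again, we can replace $\partial_\mu$ by $\nabla_\mu$, but in this case we must also replace $\eta^{\mu\sigma}$
by $g^{\mu\sigma}$, to obtain

$$g^{\mu\sigma}(\nabla_\mu\nabla_\sigma A_\lambda - \nabla_\mu\nabla_\lambda A_\sigma) = \mu_0 j_\lambda.$$

Again we have a fully covariant tensor equation, which must therefore be valid
in any arbitrary coordinate system, the metric coefficients of which are $g_{\mu\sigma}$.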

In arbitrary coordinates, the electromagnetic field equations still permit the
gauge transformation

$$A_\mu^{(\mathrm{new})} = A_\mu + \nabla_\mu\psi = A_\mu + \partial_\mu\psi,$$

where the last equality holds because the covariant derivative of the scalar field
$\psi$ is simply its partial derivative. We can again choose a scalar field $\psi$, so that

$$\nabla_\mu A^\mu = 0,$$

which is the Lorenz gauge condition in arbitrary coordinates. In this case the
electromagnetic field equations can again be written in the form

$$\Box^2 A_\mu = \mu_0 j_\mu,$$

but now the d'Alembertian operator is given by $\Box^2 = g^{\mu\nu}\nabla_\mu\nabla_\nu = \nabla_\mu\nabla^\mu$. In vacuo,
we may again write $\Box^2 A_\mu = 0$ and $\Box^2 F_{\mu\nu} = 0$. Also, charge conservation is given
in arbitrary coordinates by

$$\nabla_\mu j^\mu = 0.$$

Finally, we note that the components of $\boldsymbol{F}$ and $\boldsymbol{A}$ in two different arbitrary
coordinate systems $x^\mu$ and $x'^\mu$ are related by

$$A'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}A^\nu \quad\text{and}\quad F'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x'^\nu}{\partial x^\beta}F^{\alpha\beta}.$$

6.7 Equation of motion for a charged particle 


From our original considerations in Section 6.1, we see that the coordinate-invariant
manner of writing the equation of motion of a charged particle in an
electromagnetic field is

$$\frac{d\boldsymbol{p}}{d\tau} = m_0\frac{d\boldsymbol{u}}{d\tau} = q\,\boldsymbol{F}\cdot\boldsymbol{u},$$

where $m_0$ is the rest mass of the particle, $\boldsymbol{p}$ is its 4-momentum, $\boldsymbol{u}$ is its 4-velocity
and $\tau$ is the proper time measured along its worldline. Note that the first equality
holds because the electromagnetic force is a pure force.

In Cartesian inertial coordinates, this becomes

$$m_0\frac{du^\mu}{d\tau} = qF^\mu{}_\nu u^\nu.$$

In a general coordinate system, however, the left-hand side is no longer valid
since the ordinary derivative of the components of the 4-velocity along the particle's
worldline must be replaced by the intrinsic derivative along the worldline.
Using the expression for the intrinsic derivative given in Chapter 3, we find
that in an arbitrary coordinate system the equation of motion of a particle in an
electromagnetic field is

$$m_0\frac{Du^\mu}{D\tau} = m_0\left(\frac{du^\mu}{d\tau} + \Gamma^\mu{}_{\nu\sigma}u^\nu u^\sigma\right) = qF^\mu{}_\nu u^\nu,$$

where we have written $dx^\sigma/d\tau$ as $u^\sigma$ since the 4-velocity is the tangent to the
particle's worldline $x^\mu(\tau)$.

The equation for the particle's worldline in arbitrary coordinates is thus given by

$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma}\frac{dx^\nu}{d\tau}\frac{dx^\sigma}{d\tau} = \frac{q}{m_0}F^\mu{}_\nu\frac{dx^\nu}{d\tau}. \tag{6.13}$$

In the absence of an electromagnetic field (or for an uncharged particle), the
right-hand side is zero and we can recognise the result as the equation of a
geodesic.
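
In Cartesian inertial coordinates the connection terms vanish and the equation of motion reduces to the linear system $du^\mu/d\tau = (q/m_0)F^\mu{}_\nu u^\nu$, which is easy to integrate numerically. The Python sketch below is a minimal illustration, not a production integrator: it assumes units with $c = 1$, $q/m_0 = 1$, made-up uniform fields, and a classical fourth-order Runge-Kutta step. The mixed components $F^\mu{}_\nu$ are obtained from $F^{\mu\nu}$ by lowering the second index with $\eta_{\mu\nu}$ of signature $(+,-,-,-)$; all function names are hypothetical.

```python
def mixed_tensor(E, B, c=1.0):
    """Mixed components F^mu_nu in Cartesian inertial coordinates."""
    Ex, Ey, Ez = (e / c for e in E)
    Bx, By, Bz = B
    return [
        [0.0, Ex, Ey, Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ]


def rhs(u, F, q_over_m):
    """Right-hand side du^mu/dtau = (q/m0) F^mu_nu u^nu."""
    return [q_over_m * sum(F[m][n] * u[n] for n in range(4)) for m in range(4)]


def rk4_step(u, F, q_over_m, dtau):
    """Advance the 4-velocity by one proper-time step with classical RK4."""
    k1 = rhs(u, F, q_over_m)
    k2 = rhs([u[i] + 0.5 * dtau * k1[i] for i in range(4)], F, q_over_m)
    k3 = rhs([u[i] + 0.5 * dtau * k2[i] for i in range(4)], F, q_over_m)
    k4 = rhs([u[i] + dtau * k3[i] for i in range(4)], F, q_over_m)
    return [u[i] + dtau * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0 for i in range(4)]


def minkowski_norm(u):
    """u^mu u_mu with signature (+,-,-,-)."""
    return u[0] ** 2 - u[1] ** 2 - u[2] ** 2 - u[3] ** 2


F = mixed_tensor(E=(0.0, 0.2, 0.0), B=(0.0, 0.0, 1.0))
u = [1.25, 0.75, 0.0, 0.0]  # gamma * (c, v) for v = 0.6c with c = 1
for _ in range(1000):
    u = rk4_step(u, F, 1.0, 0.01)
norm_after = minkowski_norm(u)
print(u, norm_after)
```

Because the electromagnetic force is pure, $u^\mu u_\mu$ should stay equal to $c^2$ along the worldline; monitoring `norm_after` is a simple consistency check on both the tensor components and the integrator.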

In summary, the general procedure for converting an equation valid in Cartesian
inertial coordinates into one that is valid in an arbitrary coordinate system is as
follows:

• replace partial derivatives with covariant derivatives;

• replace ordinary derivatives along curves with intrinsic derivatives;

• replace $\eta_{\mu\nu}$ by $g_{\mu\nu}$.


Exercises 

6.1 Show that the second Maxwell equation in (6.7) can be written as $\partial_{[\sigma}F_{\mu\nu]} = 0$.

6.2 Show that the Maxwell equation (6.6) is unchanged under the gauge transformation
(6.9).

6.3 In some Cartesian inertial frame $S$, the contravariant components of the electric and
magnetic fields are $E^i$ and $B^i$ respectively. Show that the corresponding electromagnetic
field-strength tensor has the contravariant components

$$[F^{\mu\nu}] = \begin{pmatrix} 0 & -E^1/c & -E^2/c & -E^3/c \\ E^1/c & 0 & -B^3 & B^2 \\ E^2/c & B^3 & 0 & -B^1 \\ E^3/c & -B^2 & B^1 & 0 \end{pmatrix}.$$

6.4 In a Cartesian inertial coordinate system in Minkowski spacetime the field equations
of electromagnetism can be written

$$\partial_\mu F^{\mu\nu} = \mu_0 j^\nu,$$
$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0.$$

Show that these equations are equivalent to the standard form of Maxwell's equations
in vacuo.

6.5 Two Cartesian inertial frames $S$ and $S'$ are in standard configuration. Show that the
components of electric and magnetic fields in the two frames are related as follows:

$$E'^1 = E^1, \qquad B'^1 = B^1,$$
$$E'^2 = \gamma(E^2 - vB^3), \qquad B'^2 = \gamma\left(B^2 + \frac{v}{c^2}E^3\right),$$
$$E'^3 = \gamma(E^3 + vB^2), \qquad B'^3 = \gamma\left(B^3 - \frac{v}{c^2}E^2\right).$$

Show further that $c^2B^2 - E^2$ is Lorentz invariant.

6.6 Show that the transformation equations derived in Exercise 6.5 can be written as

$$\vec{E}'_\parallel = \vec{E}_\parallel, \qquad \vec{E}'_\perp = \gamma(\vec{E}_\perp + \vec{v}\times\vec{B}_\perp),$$
$$\vec{B}'_\parallel = \vec{B}_\parallel, \qquad \vec{B}'_\perp = \gamma\left(\vec{B}_\perp - \frac{1}{c^2}\vec{v}\times\vec{E}_\perp\right),$$

where $\vec{v} = (v, 0, 0)$, and $\vec{E}_\parallel$ and $\vec{E}_\perp$ denote the projections of $\vec{E}$ parallel and orthogonal
to $\vec{v}$ respectively (and similarly for $\vec{B}$). Explain why these equations must hold
for a Lorentz boost $\vec{v}$ in an arbitrary direction with respect to the axes of $S$.

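6.7 Show that one may eliminate the explicit reference to the projections of $\vec{E}$ and $\vec{B}$
in Exercise 6.6 and write the transformations as

$$\vec{E}' = \gamma(\vec{E} + \vec{v}\times\vec{B}) + \frac{1 - \gamma}{v^2}(\vec{v}\cdot\vec{E})\,\vec{v},$$
$$\vec{B}' = \gamma\left(\vec{B} - \frac{1}{c^2}\vec{v}\times\vec{E}\right) + \frac{1 - \gamma}{v^2}(\vec{v}\cdot\vec{B})\,\vec{v}.$$

6.8 Show that $\vec{E}\cdot\vec{B}$ is a Lorentz invariant.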

6.9 In an arbitrary coordinate system, the second Maxwell equation reads

$$\nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} = 0.$$

Show that this can be written as 


and hence show that V^E^y = 0. 

6.10 In Cartesian inertial coordinates, the equation of motion for a charged particle in
an electromagnetic field is

$$m_0\frac{du^\mu}{d\tau} = qF^\mu{}_\nu u^\nu.$$

Show that

$$\frac{d\vec{p}}{dt} = q(\vec{E} + \vec{u}\times\vec{B}) \quad\text{and}\quad \frac{d\mathcal{E}}{dt} = q\vec{E}\cdot\vec{u},$$

where $\vec{p}$ and $\mathcal{E}$ are the 3-momentum and the energy respectively of the particle in
$S$. Interpret these results physically.

6.11 In some inertial frame $S$, show that the 3-acceleration of a charged particle in an
electromagnetic field is

$$\frac{d\vec{u}}{dt} = \frac{q}{\gamma m_0}\left[\vec{E} + \vec{u}\times\vec{B} - \frac{1}{c^2}(\vec{u}\cdot\vec{E})\,\vec{u}\right].$$


The equivalence principle and spacetime curvature


We are now in a position to use the experience gained in deriving a relativistic
formulation of electromagnetism (together with some flashes of inspiration from 
Einstein!) to begin our formulation of a relativistic theory of gravity, namely 
general relativity. 


7.1 Newtonian gravity 

In our development of electromagnetism, we began by considering the electromagnetic
3-force on a charged particle. Let us therefore start our discussion of
gravity by considering the description of the gravitational force in the classical,
non-relativistic, theory of Newton. In the Newtonian theory, the gravitational
force $\vec{f}$ on a (test) particle of gravitational mass $m_G$ at some position is

$$\vec{f} = m_G\,\vec{g} = -m_G\vec{\nabla}\Phi,$$

where $\vec{g}$ is the gravitational field derived from the gravitational potential $\Phi$ at that
position. In turn, the gravitational potential is determined by Poisson's equation:

$$\nabla^2\Phi = 4\pi G\rho, \tag{7.1}$$

where $\rho$ is the gravitational matter density and $G$ is Newton's gravitational
constant. This is the field equation of Newtonian gravity.

It is clear from (7.1) that Newtonian gravity is not consistent with special
relativity. There is no explicit time dependence, which means that the potential
$\Phi$ (and hence the gravitational force on a particle) responds instantaneously to a
disturbance in the matter density $\rho$; this violates the special-relativistic requirement
that signals cannot propagate faster than $c$. We might try to remedy this by noting
that the Laplacian operator $\nabla^2$ in (7.1) is equivalent to minus the d'Alembertian
operator $\Box^2$ in the limit $c \to \infty$, and thus postulate the modified field equation

$$\Box^2\Phi = -4\pi G\rho.$$

However, this equation does not yield a consistent relativistic theory. It is still
not Lorentz covariant, since the matter density $\rho$ does not transform as a Lorentz
scalar. We shall discuss the transformation properties of the matter density later.

In addition to the incompatibility of Newtonian gravity with special relativity, 
there is a second fundamental difference between the electromagnetic and gravitational
forces. The equation of motion of a particle of inertial mass $m_I$ in a
gravitational field is given by

$$\frac{d^2 \mathbf{x}}{dt^2} = -\frac{m_G}{m_I} \nabla\Phi. \tag{7.2}$$

It is a well-established experimental fact, however, that the ratio $m_G/m_I$ appearing
in the equation of motion is the same for all particles. By an appropriate choice of
units one may thus arrange for this ratio to equal unity. In contrast, the ratio $q/m_I$
occurring in the equation of motion of a charged particle in an electromagnetic
field is not the same for all particles. From (7.2), we thus see that the trajectory
through space of a particle in a gravitational field is independent of the nature of 
the particle. 

This equivalence of the gravitational and inertial masses (which allows us to 
refer simply to ‘the mass’), is a truly remarkable coincidence in the Newtonian 
theory. In this theory, there is no a-priori reason why the quantity that determines 
the magnitude of the gravitational force on the particle should equal the quantity 
that determines the particle’s ‘resistance’ to an applied force in general. It appears
as an isolated experimental result, which has since been verified to an accuracy
of at least one part in $10^{11}$ (by Dicke and co-workers).


7.2 The equivalence principle 

The equality of the gravitational and inertial masses of a particle led Einstein 
to his classic ‘elevator’ thought experiment. Consider an observer in a freely 
falling elevator (i.e. after the lift cable has been cut). Objects released from 
rest relative to the elevator cabin remain floating ‘weightless’ in the cabin. 
A projectile shot from one side of the elevator to the other appears to move
in a straight line at constant velocity, rather than in the usual curved trajectory.
All this follows from the fact that the acceleration of any particle relative to
the elevator is zero: the particle and the elevator cabin have the same acceleration
relative to the Earth as a result of the equivalence of gravitational and inertial
mass. 

All these observations would hold exactly if the gravitational field of the Earth 
were truly uniform. Of course, the gravitational field of the Earth is not uniform 
but acts radially inwards towards its centre of mass, with a strength proportional 
to $1/r^2$. Thus, if the elevator were left to free-fall for a long time or if it were very
large (i.e. a significant fraction of the Earth’s radius), two particles released from 
rest near the walls of the elevator would gradually drift inwards, since they would 
both be falling along radial lines towards the centre of the Earth (see Figure 7.1). 
Furthermore, as a result of the varying strength of the gravitational field, particles 
released from rest near the floor of the elevator would gradually drift downwards 
whereas those near the ceiling would drift upwards. What the observer in the 
elevator would be experiencing would be the tidal forces resulting from the 
residual inhomogeneity in the strength and direction of the gravitational field once 
the main acceleration has been subtracted. It should always be remembered that 
these tidal forces can never be completely abolished in an elevator (laboratory) 
of finite, i.e. non-zero, size. 
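
To get a feel for the size of these tidal effects, the following sketch compares the Newtonian acceleration at the floor and at the ceiling of a freely falling cabin. The cabin height of 3 m and the rounded values of $G$ and of the Earth's mass and radius are illustrative assumptions, not values taken from the text.

```python
# Illustrative estimate of the vertical tidal acceleration across a cabin.
# All numerical values below are assumed, rounded constants.
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # mean radius of the Earth, m


def g_newton(r, mass=M_EARTH):
    """Magnitude of the Newtonian acceleration G M / r^2 at radius r."""
    return G * mass / r**2


def tidal_stretch(r, height, mass=M_EARTH):
    """Difference in acceleration between floor (r) and ceiling (r + height)."""
    return g_newton(r, mass) - g_newton(r + height, mass)


height = 3.0  # hypothetical cabin height, m
exact = tidal_stretch(R_EARTH, height)
# First-order estimate: |d(GM/r^2)/dr| * height = 2 G M height / r^3
approx = 2.0 * G * M_EARTH * height / R_EARTH**3

print(f"g at the floor   : {g_newton(R_EARTH):.3f} m/s^2")
print(f"tidal difference : {exact:.3e} m/s^2")
print(f"linear estimate  : {approx:.3e} m/s^2")
```

Under these assumptions the residual acceleration is about a millionth of the main acceleration: small, but never exactly zero for a cabin of non-zero size.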

Nevertheless, provided that we consider the elevator cabin over a short time 
period and that it is spatially small, then a freely falling elevator (which may
have $(x, y, z)$ coordinates marked on its walls and an elevator clock measuring
time $t$) resembles a Cartesian inertial frame of reference, and therefore the laws
of special relativity hold inside the elevator.¹ These observations lead to

The equivalence principle: In a freely falling (non-rotating) laboratory occupying
a small region of spacetime, the laws of physics are those of special relativity.²


7.3 Gravity as spacetime curvature 

These observations led Einstein to make a profound proposal that simultaneously 
provides for a relativistic description of gravity and incorporates in a natural way 
the equivalence principle (and consequently the equivalence of gravitational and 
inertial mass). Einstein’s proposal was that gravity should no longer be regarded 
as a force in the conventional sense but rather as a manifestation of the curvature 
of the spacetime, this curvature being induced by the presence of matter. This is 
the central idea underpinning the theory of general relativity. 


1 The elevator cabin must not only occupy a small region of spacetime but also be non-rotating with respect to 
distant matter in the universe. This statement is related to Mach’s principle. 

2 This is in fact a statement of the strong equivalence principle, since it refers to all the laws of physics. The
more modest weak equivalence principle refers only to the trajectories of freely falling particles.

Figure 7.1 An elevator in free-fall towards the Earth.


If gravity is regarded as a manifestation of the curvature of spacetime itself, and
not as the action of some 4-force $\mathbf{f}$ defined on the manifold, then the equation of
motion of a particle moving only under the influence of gravity must be that of a
‘free’ particle in the curved spacetime, i.e.

$$\frac{d\mathbf{p}}{d\tau} = 0,$$

where $\mathbf{p}$ is the particle’s 4-momentum and $\tau$ is the proper time measured along the
particle’s worldline. Thus, the worldline of a particle freely falling under gravity 
is a geodesic in the curved spacetime. 

The equivalence principle restricts the possible geometry of the curved spacetime
to pseudo-Riemannian, as follows. The mathematical meaning of the equivalence
principle is that it requires that at any event $P$ in the spacetime manifold
we must be able to define a coordinate system $X^\mu$ such that, in the local
neighbourhood of $P$, the line element of spacetime takes the form

$$ds^2 \approx \eta_{\mu\nu}\, dX^\mu dX^\nu,$$

where exact equality holds at the event $P$. From the geodesic equation (as shown
in Chapter 5), in such a coordinate system the path of a ‘free’ particle, i.e. one
moving only under the influence of gravity, in the vicinity of the event $P$ is
given by

$$\frac{d^2 X^i}{dT^2} \approx 0,$$

where $i = 1, 2, 3$ and we have denoted $X^0$ by $cT$ (once again the equality in the
above equations holds exactly at $P$). Thus, in the vicinity of $P$ the coordinates
define a local Cartesian inertial frame (like our small elevator considered over a
short time interval), in which the laws of special relativity hold locally. In order
that we can construct such a system, spacetime must be a pseudo-Riemannian
manifold (which is curved and four-dimensional). For such a manifold, in some
arbitrary coordinate system $x^\mu$ the line element takes the general form

$$ds^2 = g_{\mu\nu}(x)\, dx^\mu dx^\nu.$$


7.4 Local inertial coordinates 

The curvature of spacetime means that it is not possible to find coordinates in 
which the metric $g_{\mu\nu} = \eta_{\mu\nu}$ at all points in the manifold. Thus, it is not possible
to define global Cartesian inertial frames as we could in the pseudo-Euclidean
Minkowski spacetime. Instead, we are forced to use arbitrary coordinate systems
$x^\mu$ to label events in spacetime, and these coordinates often do not have simple
physical meanings. It is often the case that $x^0$ is a timelike coordinate and the
$x^i$ $(i = 1, 2, 3)$ are spacelike (i.e. the tangent vector to the $x^0$ coordinate curve is
timelike at all points, and similarly the tangent vectors to the $x^i$ coordinate curves
are always spacelike). This allocation of coordinates is not necessary, however,
and it is sometimes useful to define null coordinates. In any case, the arbitrary
coordinates $x^\mu$ need not have any direct physical interpretation.

Nevertheless, as demanded by the equivalence principle, problems of physical 
meaning can always be overcome by transforming, at any event P in the curved 
spacetime, to a local inertial coordinate system $X^\mu$, which, in a limited region of
spacetime about $P$, corresponds to a freely falling, non-rotating, Cartesian frame
over a short time interval. Mathematically, this corresponds to constructing about
the event $P$ a coordinate system $X^\mu$ such that

$$g_{\mu\nu}(P) = \eta_{\mu\nu} \quad\text{and}\quad (\partial_\sigma g_{\mu\nu})_P = 0. \tag{7.3}$$

This also means that $\Gamma^\mu{}_{\nu\sigma}(P) = 0$ and that the coordinate basis vectors at the
event $P$ form an orthonormal set, i.e.

$$\mathbf{e}_\mu(P) \cdot \mathbf{e}_\nu(P) = \eta_{\mu\nu}. \tag{7.4}$$

There are in fact an infinite number of local inertial coordinate systems at $P$, all
of which are related to one another by Lorentz transformations. In other words,
if a coordinate system $X^\mu$ satisfies the conditions (7.3), and hence the condition
(7.4), then so too will the coordinate system

$$X'^\mu = \Lambda^\mu{}_\nu X^\nu,$$

where $\Lambda^\mu{}_\nu$ defines a Lorentz transformation. Thus, local Cartesian freely falling
(non-rotating) frames at an event $P$ are related to one another by boosts, spatial
rotations or combinations of the two. For any one of these coordinate systems, the
timelike basis vector $\mathbf{e}_0(P)$ is simply the normalised 4-velocity vector $\hat{\mathbf{u}}(P)$ of the
origin of that frame at the event $P$, and the three mutually orthogonal spacelike
vectors $\mathbf{e}_i(P)$ $(i = 1, 2, 3)$ define the orientation of the spatial axes of the frame.
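
The statement that local inertial coordinate systems at $P$ are related by Lorentz transformations rests on the fact that such transformations leave the Minkowski form of the metric components unchanged. The short check below illustrates this for a single boost along the $x$-axis; the signature $(+,-,-,-)$, the boost speed and the helper names are assumptions made for the illustration.

```python
import math

# Minkowski metric components, assuming the signature (+, -, -, -).
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def boost_x(beta):
    """Lorentz boost along x with speed beta = v/c (requires |beta| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transformed_metric(lam, eta=ETA):
    """Return eta_{rho sigma} Lambda^rho_mu Lambda^sigma_nu as a 4x4 list."""
    n = len(eta)
    return [[sum(eta[r][s] * lam[r][m] * lam[s][v]
                 for r in range(n) for s in range(n))
             for v in range(n)]
            for m in range(n)]


new_eta = transformed_metric(boost_x(0.6))  # 0.6 is an arbitrary test speed
max_dev = max(abs(new_eta[m][v] - ETA[m][v])
              for m in range(4) for v in range(4))
print("largest deviation from eta:", max_dev)
```

The deviation is zero up to rounding error, so a frame that is locally inertial at $P$ remains so after the boost.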

For points near to $P$, the metric in a local inertial coordinate system $X^\mu$ (whose
origin is at $P$) is given by

$$g_{\mu\nu} = \eta_{\mu\nu} + \tfrac{1}{2} (\partial_\sigma \partial_\rho g_{\mu\nu})_P X^\sigma X^\rho + \cdots.$$

The sizes of the second derivatives $(\partial_\sigma \partial_\rho g_{\mu\nu})_P$ thus determine the region over
which the approximation $g_{\mu\nu} \approx \eta_{\mu\nu}$ remains valid. We shall see the significance
of these second derivatives shortly. 


7.5 Observers in a curved spacetime 

We discussed the subject of observers in Minkowski spacetime in Chapter 5, but 
let us now consider the subject in its full generality, in a curved spacetime. An 
observer will trace out some general (timelike) worldline $x^\mu(\tau)$ through spacetime,
as expressed in some arbitrary coordinate system, where $\tau$ is the observer’s proper
time. An idealisation of his local laboratory is a frame of four orthonormal vectors
$\hat{\mathbf{e}}_\alpha(\tau)$ (or tetrad) satisfying

$$\hat{\mathbf{e}}_\alpha(\tau) \cdot \hat{\mathbf{e}}_\beta(\tau) = \eta_{\alpha\beta},$$

which are carried with him along his worldline (these vectors may, in general,
be totally unrelated to the basis vectors $\mathbf{e}_\mu$ of the coordinate system that we are
using to label points in spacetime, although we can always express one set of
vectors in terms of the other). In particular, at any point along his worldline the
timelike vector $\hat{\mathbf{e}}_0(\tau)$ coincides with the normalised 4-velocity $\hat{\mathbf{u}}(\tau) = \mathbf{u}(\tau)/c$ of
the observer. Similarly, the evolution of the spacelike vectors $\hat{\mathbf{e}}_i(\tau)$ along the
worldline reflects the different ways in which his local laboratory may be spinning
or tumbling. Quantities measured in this laboratory correspond to projections of 
the relevant physical 4-vectors and 4-tensors onto this orthonormal frame. 

As shown in Chapter 5, if the observer has a 4-acceleration $\mathbf{a}(\tau) = d\mathbf{u}/d\tau$ but
is not rotating, the tetrad basis vectors are Fermi-Walker-transported along the
observer’s worldline:

$$\frac{d\hat{\mathbf{e}}_\alpha}{d\tau} = \frac{1}{c^2} \left[ (\mathbf{u} \cdot \hat{\mathbf{e}}_\alpha)\, \mathbf{a} - (\mathbf{a} \cdot \hat{\mathbf{e}}_\alpha)\, \mathbf{u} \right]. \tag{7.5}$$

This expression holds equally well in a curved spacetime. An important special
case is that of a non-rotating, freely falling observer, i.e. one who is moving only
under the influence of gravity. The vectors $\hat{\mathbf{e}}_\alpha(\tau)$ then define what is called a
freely falling frame (FFF). Free from any external forces, the observer’s worldline
traces out a geodesic in the curved spacetime. Thus the timelike vector $\hat{\mathbf{e}}_0$ changes
with proper time along the worldline according to

$$\frac{d\hat{\mathbf{e}}_0}{d\tau} = 0.$$

In other words, $\hat{\mathbf{e}}_0$ is parallel-transported along the worldline, and the observer’s
4-acceleration $\mathbf{a}$ is zero. In this case we see from (7.5) that Fermi-Walker transport
reduces to parallel transport. Thus the spacelike frame vectors $\hat{\mathbf{e}}_i$ $(i = 1, 2, 3)$ are
also parallel-transported along the geodesic, so that

$$\frac{d\hat{\mathbf{e}}_i}{d\tau} = 0.$$

Hence, in an arbitrary coordinate system $x^\mu$, the components $(\hat{e}_\alpha)^\mu(\tau) = \hat{\mathbf{e}}_\alpha(\tau) \cdot \mathbf{e}^\mu$
of any frame vector evolve as follows:

$$\frac{D(\hat{e}_\alpha)^\mu}{D\tau} \equiv \frac{d(\hat{e}_\alpha)^\mu}{d\tau} + \Gamma^\mu{}_{\nu\sigma} (\hat{e}_\alpha)^\nu \frac{dx^\sigma}{d\tau} = 0.$$


This equation is extremely useful for determining what a freely falling observer 
would measure at a given event in spacetime. It is also clear that the frame vectors 
$\hat{\mathbf{e}}_\alpha$ at any event $P$ along the observer’s worldline are the basis vectors of a local
Cartesian inertial coordinate system at P. 


7.6 Weak gravitational fields and the Newtonian limit 

It is clear that, by construction, our description of gravity in terms of spacetime 
curvature reduces to special relativity in local inertial frames. It is important to 
check, however, that such a description also reduces to Newtonian gravity in the 
appropriate limits. 

In the absence of gravity, spacetime has a Minkowski geometry. Therefore a 
weak gravitational field corresponds to a region of spacetime that is only ‘slightly’ 
curved. In other words, in such a region there exist coordinates $x^\mu$ in which the
metric takes the form

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \quad\text{where } |h_{\mu\nu}| \ll 1. \tag{7.6}$$

Note that it is important to say ‘there exist coordinates’ since (7.6) does not hold
for all coordinates; as we saw in Chapter 5, one can find coordinates even in
Minkowski space in which $g_{\mu\nu}$ is not close to the simple form $\eta_{\mu\nu}$. Let us assume
that in the coordinate system (7.6) the metric is stationary, which means that all
the derivatives $\partial_0 g_{\mu\nu}$ are zero. An example of such a coordinate system might be
a fixed Cartesian frame at some point on the surface of the (non-rotating) Earth.

The worldline of a particle freely falling under gravity is given in general by 
the geodesic equation 

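$$\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma} \frac{dx^\nu}{d\tau} \frac{dx^\sigma}{d\tau} = 0.$$

We shall assume, however, that the particle is slow-moving, so that the components
of its 3-velocity satisfy $dx^i/dt \ll c$ $(i = 1, 2, 3)$, where $t$ is defined by
$x^0 = ct$. This is equivalent to demanding that, for $i = 1, 2, 3$,

$$\frac{dx^i}{d\tau} \ll \frac{dx^0}{d\tau}.$$

Thus we can ignore the 3-velocity terms in the geodesic equation to obtain

$$\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{00}\, c^2 \left( \frac{dt}{d\tau} \right)^2 = 0. \tag{7.7}$$
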


Now, recalling the expression (3.21) giving the connection in terms of the metric
and using the form (7.6) for $g_{\mu\nu}$, we find that the connection coefficients $\Gamma^\mu{}_{00}$
are given by

$$\Gamma^\mu{}_{00} = \tfrac{1}{2} g^{\kappa\mu} (\partial_0 g_{0\kappa} + \partial_0 g_{0\kappa} - \partial_\kappa g_{00}) = -\tfrac{1}{2} g^{\kappa\mu} \partial_\kappa g_{00} = -\tfrac{1}{2} \eta^{\kappa\mu} \partial_\kappa h_{00},$$

where the last equality is valid to first order in $h_{\mu\nu}$. Since we have assumed that
the metric is stationary, we have

$$\Gamma^0{}_{00} = 0 \quad\text{and}\quad \Gamma^i{}_{00} = \tfrac{1}{2} \delta^{ij} \partial_j h_{00},$$

where the Latin index runs over $i = 1, 2, 3$. Inserting these coefficients into (7.7)
gives

$$\frac{d^2 t}{d\tau^2} = 0$$

and

$$\frac{d^2 \mathbf{x}}{d\tau^2} = -\tfrac{1}{2} c^2 \left( \frac{dt}{d\tau} \right)^2 \nabla h_{00}.$$

The first equation implies that $dt/d\tau$ = constant, and so we can combine the two
equations to yield the following equation of motion for the particle:

$$\frac{d^2 \mathbf{x}}{dt^2} = -\tfrac{1}{2} c^2 \nabla h_{00}.$$

If we compare this equation with the usual Newtonian equation of motion for
a particle in a gravitational field (7.2), we see that the two are identical if we
make the identification $h_{00} = 2\Phi/c^2$. Hence for a slowly moving particle our
description of gravity as spacetime curvature tends to the Newtonian theory if the
metric is such that, in the limit of a weak gravitational field,

$$g_{00} = \left( 1 + \frac{2\Phi}{c^2} \right). \tag{7.8}$$


How big is the correction to the Minkowski metric? Some values of $\Phi/c^2$ for
various systems are as follows:

$$\frac{\Phi}{c^2} = -\frac{GM}{c^2 r} \approx \begin{cases} -10^{-9} & \text{at the surface of the Earth,} \\ -10^{-6} & \text{at the surface of the Sun,} \\ -10^{-4} & \text{at the surface of a white dwarf star.} \end{cases}$$

Thus, we see that even at the surface of a dense object like a white dwarf, the
size of $\Phi/c^2$ is much smaller than unity and hence the weak-field limit will be
an excellent approximation.
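
These orders of magnitude can be reproduced directly from $\Phi/c^2 = -GM/(c^2 r)$. The sketch below uses rounded values for the constants and for the Earth and the Sun; the white dwarf parameters (0.6 solar masses and a radius of 8400 km) are an assumed, merely representative choice.

```python
# Order-of-magnitude check of Phi/c^2 = -G M / (c^2 r) at a body's surface.
# Constants are rounded; the white dwarf entry is a hypothetical example.
G = 6.674e-11      # m^3 kg^-1 s^-2
C = 2.998e8        # m/s
M_SUN = 1.989e30   # kg

BODIES = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (M_SUN, 6.957e8),
    "white dwarf (assumed)": (0.6 * M_SUN, 8.4e6),
}


def phi_over_c2(mass, radius):
    """Dimensionless Newtonian potential at the surface of a spherical body."""
    return -G * mass / (C**2 * radius)


for name, (mass, radius) in BODIES.items():
    print(f"{name:22s} Phi/c^2 = {phi_over_c2(mass, radius):.2e}")
```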

From (7.8), the observant reader will have noticed that the description of gravity 
in terms of spacetime curvature has another immediate consequence, namely that 
the time coordinate t does not, in general, measure proper time. If we consider a 
clock at rest at some point in our coordinate system (i.e. $dx^i/dt = 0$), the proper
time interval $d\tau$ between two ‘clicks’ of the clock is given by

$$c^2 d\tau^2 = g_{\mu\nu}\, dx^\mu dx^\nu = g_{00}\, c^2 dt^2,$$

from which we find that

$$d\tau = \left( 1 + \frac{2\Phi}{c^2} \right)^{1/2} dt.$$

This gives the interval of proper time $d\tau$ corresponding to an interval $dt$ of coordinate
time for a stationary observer near a massive object, in a region where the
gravitational potential is $\Phi$. Since $\Phi$ is negative, this proper time interval is shorter
than the corresponding interval for a stationary observer at a large distance from
the object, where $\Phi \to 0$ and so $d\tau = dt$. Thus, as a bonus, our analysis has also
yielded the formula for time dilation in a weak gravitational field.
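
As a rough numerical illustration of this formula, the sketch below evaluates $d\tau/dt$ for a clock at rest on the surface of the Earth, relative to a stationary clock far away. The constants are rounded, and the Earth's rotation and the potentials of other bodies are ignored, so the result is only indicative.

```python
import math

# Weak-field time dilation, d tau = sqrt(1 + 2 Phi / c^2) dt, for a static clock.
# Rounded constants; rotation and other bodies are deliberately neglected.
G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m/s
M_EARTH = 5.972e24   # kg
R_EARTH = 6.371e6    # m


def proper_time_rate(phi):
    """Return d tau / dt for a clock at rest where the potential is phi."""
    return math.sqrt(1.0 + 2.0 * phi / C**2)


phi_surface = -G * M_EARTH / R_EARTH
rate = proper_time_rate(phi_surface)
lag_per_day = (1.0 - rate) * 86400.0  # seconds lost per day of coordinate time

print(f"d tau / dt at the surface: 1 - {1.0 - rate:.3e}")
print(f"lag relative to a distant clock: {lag_per_day * 1e6:.1f} microseconds per day")
```

With these inputs the surface clock falls behind the distant one by roughly 60 microseconds per day.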


7.7 Electromagnetism in a curved spacetime 

Before going on to discuss the mathematics of curvature in detail, let us look 
back at our development of electromagnetism in Chapter 6. It is clear that our
derivation of the electromagnetic field equations in arbitrary coordinates did not
depend on the intrinsic geometry of the manifold on which the electromagnetic
field tensor $\mathbf{F}$ and the 4-current $\mathbf{j}$ are defined. In other words, one can arrive at
these equations without assuming the spacetime to have a Minkowski geometry.
Thus, in the presence of gravitating matter, spacetime becomes curved but the
field equations of electromagnetism in an arbitrary coordinate system are still
given by

$$\begin{aligned}
\nabla_\mu F^{\mu\nu} &= \mu_0 j^\nu, \\
\nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} &= 0.
\end{aligned} \tag{7.9}$$

The effects of gravitation are automatically included in these field equations
through the covariant derivatives, which depend on the metric $g_{\mu\nu}$ describing
the spacetime geometry. Moreover, if we construct a local Cartesian coordinate
system about some point $P$ in the manifold then (as discussed above) these
coordinates correspond to a local inertial frame in the neighbourhood of $P$. In these
coordinates, the equations of electromagnetism then take their familiar special
relativistic forms. 

An electromagnetic field tensor F defined on a curved spacetime gives rise (as 
in Minkowski space) to a 4-force $\mathbf{f} = q\mathbf{F} \cdot \mathbf{u}$, which acts on a particle of charge
$q$ with 4-velocity $\mathbf{u}$. Thus the equation of motion of a charged particle moving
under the influence of an electromagnetic field in a curved spacetime has the
same form as that in Minkowski spacetime, i.e.

$$m_0 \frac{d\mathbf{u}}{d\tau} = q\mathbf{F} \cdot \mathbf{u},$$

where $m_0$ is the rest mass of the particle. In this case, however, because of
the curvature of spacetime the particle is moving under the influence of both
electromagnetic forces and gravity. In some arbitrary coordinate system, the
particle’s worldline is again given by

$$\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma} \frac{dx^\nu}{d\tau} \frac{dx^\sigma}{d\tau} = \frac{q}{m_0} F^\mu{}_\nu \frac{dx^\nu}{d\tau}.$$

Obviously, in the absence of an electromagnetic field (or for an uncharged particle),
the right-hand side is zero and we recover the equation of a geodesic.

We must remember, however, that the energy and momentum of the electromagnetic
field will itself induce a curvature of spacetime, so the metric in this
case is determined not only by the matter distribution but also by the radiation.


7.8 Intrinsic curvature of a manifold

Since the notion of curvature is central to general relativity, we must now investigate
how to quantify the intrinsic curvature of a manifold at any given point $P$.³
A manifold (or region of a manifold) is flat if there exist coordinates $X^a$ such
that, throughout the region, the line element can be written

$$ds^2 = \epsilon_1 (dX^1)^2 + \epsilon_2 (dX^2)^2 + \cdots + \epsilon_N (dX^N)^2, \tag{7.10}$$

where $\epsilon_a = \pm 1$ (in other words ‘flat’ is a shorthand for pseudo-Euclidean). If,
however, points in the manifold are labelled with some arbitrary coordinate system
$x^a$ then in general the line element $ds^2$ will not be of the above form. Thus, if for
some manifold the line element is given by

$$ds^2 = g_{ab}(x)\, dx^a dx^b,$$

how can we tell whether the intrinsic geometry of the manifold in some region is 
flat or curved in some way? 

Consider, for example, the following line element for a three-dimensional space:

$$ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2.$$

Of course, we recognise this as the line element of ordinary three-dimensional
Euclidean space written in spherical polar coordinates. In other words, the
transformation

$$x = r \sin\theta \cos\phi, \qquad y = r \sin\theta \sin\phi, \qquad z = r \cos\theta$$

will turn the above line element into the form

$$ds^2 = dx^2 + dy^2 + dz^2. \tag{7.11}$$

But what about other line elements? For example, recall from Chapter 2 the
three-dimensional space described by the line element (2.21):

$$ds^2 = \frac{a^2}{a^2 - r^2}\, dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2.$$

How can we tell whether this metric, or a more complicated metric, corresponds
to flat space but merely looks complicated because of a weird choice of coordinates?
It would be immensely tedious to try to discover whether there exists a
coordinate transformation that reduces a metric to the form (7.11). We therefore
need some means of telling whether a manifold is flat directly from the metric
$g_{ab}$, independently of the coordinate system being used.


3 Since the material presented here is applicable to any $N$-dimensional pseudo-Riemannian manifold, we will
use indices $a$, $b$ etc. that have a range 1 to $N$, rather than $\mu$, $\nu$ etc., with a range 0 to 3. Of course, the final
application to general relativity will govern the scope of our results.

The physical significance of this to general relativity is as follows. If, throughout
some region of a four-dimensional spacetime, we can reduce the line element 

$$ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu$$

to Minkowski form then there can be no gravitational field in this region. The 
equivalence of a general line element to that of Minkowski spacetime therefore 
guarantees that the gravitational field will vanish. The solution to our mathematical 
problem of finding a coordinate-independent way of defining the curvature of 
spacetime will lead us to the field equations of gravity. 


7.9 The curvature tensor 

We can find a solution to the problem of measuring the curvature of a manifold at 
any point by considering changing the order of covariant differentiation. Covariant 
differentiation is clearly a generalisation of partial differentiation. There is one 
important respect in which it differs, however: it matters in which order covariant 
differentiation is performed, and changing the order (in general) changes the result. 

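Since for a scalar field the covariant derivative is simply the partial derivative,
the order of differentiation does not matter. However, let us consider some
arbitrary vector field defined on a manifold, with covariant components $v_a$. The
covariant derivative of the $v_a$ is given by

$$\nabla_b v_a = \partial_b v_a - \Gamma^d{}_{ab} v_d.$$

A second covariant differentiation then yields

$$\begin{aligned}
\nabla_c \nabla_b v_a &= \partial_c (\nabla_b v_a) - \Gamma^e{}_{ac} \nabla_b v_e - \Gamma^e{}_{bc} \nabla_e v_a \\
&= \partial_c \partial_b v_a - (\partial_c \Gamma^d{}_{ab}) v_d - \Gamma^d{}_{ab} \partial_c v_d \\
&\quad - \Gamma^e{}_{ac} (\partial_b v_e - \Gamma^d{}_{eb} v_d) - \Gamma^e{}_{bc} (\partial_e v_a - \Gamma^d{}_{ae} v_d),
\end{aligned}$$

which follows since $\nabla_b v_a$ is itself a rank-2 tensor. Swapping the indices $b$ and $c$
to obtain a corresponding expression for $\nabla_b \nabla_c v_a$ and then subtracting gives

$$\nabla_c \nabla_b v_a - \nabla_b \nabla_c v_a = R^d{}_{abc} v_d, \tag{7.12}$$

where

$$R^d{}_{abc} \equiv \partial_b \Gamma^d{}_{ac} - \partial_c \Gamma^d{}_{ab} + \Gamma^e{}_{ac} \Gamma^d{}_{eb} - \Gamma^e{}_{ab} \Gamma^d{}_{ec}. \tag{7.13}$$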


To determine directly whether the $N^4$ quantities $R^d{}_{abc}$ transform as the components
of a tensor under a coordinate transformation would be an arduous algebraic
task. Fortunately the quotient theorem (Section 4.11) provides a much shorter
route. The left-hand side of (7.12) is a tensor, for arbitrary vectors $v_a$, so the
contraction of $R^d{}_{abc}$ with $v_d$ is also a tensor. Since $R^d{}_{abc}$ does not depend on $v_a$,
we conclude from the quotient theorem that the $R^d{}_{abc}$ are indeed the components
of some rank-4 tensor $\mathbf{R}$. This tensor is called the curvature tensor (or Riemann
tensor), and equation (7.13) shows that it is defined in terms of the metric tensor
$g_{ab}$ and its first and second derivatives.

We must now establish how the tensor (7.13) is related to the curvature of the
manifold. In a flat region of a manifold, we may choose coordinates such that
the line element takes the form (7.10) throughout the region. In these coordinates
$\Gamma^a{}_{bc}$ and its derivatives are zero, and hence

$$R^d{}_{abc} = 0$$

at every point in the region. This is a tensor relation, however, and so it must hold
in any coordinate system. Conversely, if $R^d{}_{abc} = 0$ at every point in some region
of a manifold, then it may be shown that it is possible to introduce a coordinate
system in which the line element takes the form (7.10), and hence this region
is flat.⁴ Thus the vanishing of the curvature tensor is a necessary and sufficient
condition for a region of a manifold to be flat.


7.10 Properties of the curvature tensor 

The curvature tensor (7.13) possesses a number of symmetries and satisfies certain 
identities, which we now discuss. The symmetries of the curvature tensor are most
easily derived in terms of its covariant components

$$R_{abcd} = g_{ae} R^e{}_{bcd}.$$

For completeness, we note that in an arbitrary coordinate system an explicit form
for these components is found, after considerable algebra, to be

$$R_{abcd} = \tfrac{1}{2} (\partial_d \partial_a g_{bc} - \partial_d \partial_b g_{ac} + \partial_c \partial_b g_{ad} - \partial_c \partial_a g_{bd}) - g^{ef} (\Gamma_{eac} \Gamma_{fbd} - \Gamma_{ead} \Gamma_{fbc}).$$

One could use this expression straightforwardly to derive the symmetry properties
of the curvature tensor, but we take the opportunity here to illustrate a general
mathematical device that is often useful in reducing the algebraic burden of tensor
manipulations.

4 For a proof of this result, see (for example) P. A. M. Dirac, General Theory of Relativity, Princeton Landmarks
in Physics Series, Princeton University Press, 1996.

Let us choose some arbitrary point $P$ in the manifold and construct a geodesic
coordinate system about this point (see Section 3.11), in which the connection
vanishes, $\Gamma^a{}_{bc}(P) = 0$, although in general its derivatives will not. In this
coordinate system, one may easily show directly from (7.13) that the covariant
components of the curvature tensor at $P$ are given by

$$(R_{abcd})_P = \tfrac{1}{2} (\partial_d \partial_a g_{bc} - \partial_d \partial_b g_{ac} + \partial_c \partial_b g_{ad} - \partial_c \partial_a g_{bd})_P.$$


From this expression one may immediately establish the following symmetry 
properties at P: 

$$R_{abcd} = -R_{bacd}, \tag{7.14}$$

$$R_{abcd} = -R_{abdc}, \tag{7.15}$$

$$R_{abcd} = R_{cdab}. \tag{7.16}$$

The first two properties show that the curvature tensor is antisymmetric with
respect to swapping the order of either the first two indices or the second two
indices. The third property shows that it is symmetric with respect to swapping
the first pair of indices with the second pair of indices. Moreover, we may also
easily deduce the cyclic identity

$$R_{abcd} + R_{acdb} + R_{adbc} = 0, \tag{7.17}$$

which on using (7.15) may be written more succinctly as $R_{a[bcd]} = 0$. Although
the results (7.14-7.17) have been derived in a special coordinate system, each 
condition is a tensor relation and so if it is valid in one coordinate system then it is 
valid in all. Moreover, since the point P is arbitrary, the results hold everywhere. 

Although first appearances might suggest that the curvature tensor has $N^4$
components, the conditions (7.14-7.17) reduce the number of independent components
to $N^2(N^2 - 1)/12$. Recall from Section 2.11 that this is also the number of
degrees of freedom among the second derivatives $\partial_d \partial_c g_{ab}$. This is not surprising
since, at any point $P$ in a manifold, we can perform a transformation to local
Cartesian coordinates in which $g_{ab}(P) = \eta_{ab}$ and $(\partial_c g_{ab})_P = 0$. Thus, a general
metric at any point $P$ is characterised by the $N^2(N^2 - 1)/12$ second derivatives
that cannot be made to vanish there.
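
The counting formula is easy to tabulate. The helper below (its name is chosen here purely for illustration) evaluates $N^2(N^2 - 1)/12$ alongside the naive count $N^4$ for the first few dimensions.

```python
def riemann_independent_components(n):
    """Number of independent components N^2 (N^2 - 1) / 12 in n dimensions."""
    # n^2 (n^2 - 1) is always divisible by 12, so integer division is exact.
    return n * n * (n * n - 1) // 12


print(" N   N^4   independent")
for n in range(1, 6):
    print(f"{n:2d} {n**4:5d} {riemann_independent_components(n):8d}")
```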

For manifolds of different dimensions we have the following results: 


No. of dimensions                               2    3    4
No. of independent components of $R_{abcd}$     1    6    20


You can see from this table that in four dimensions the number of independent
components is reduced from a possible 256 to 20. You will also see that in one
dimension the curvature tensor is always equal to zero: $R_{1111} = 0$. How can this
be? Can a line not be curved? Think about this - the curvature measures the
‘inner’ properties of the space. When we say that a line is curved we refer to 
a particular embedding in a higher-dimensional space, but this does not tell us 
about the inner properties of the space. In one dimension, it is evident that we 
can always find a coordinate transformation that will reduce an arbitrary metric to 
the form (7.10). As a two-dimensional example, in Appendix 7A we calculate the 
single independent component of the curvature tensor for the surface of a sphere. 
The Gaussian curvature K of a two-dimensional surface is given by 

$$K = \frac{R_{1212}}{g},$$

where $g = \det[g_{ab}]$ is the determinant of the metric tensor.
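
The following sketch evaluates the definition (7.13) numerically and then forms $K = R_{1212}/g$ for a sphere, with the flat plane in polar coordinates as a contrasting case. The connection coefficients used are the standard ones for these two metrics and are quoted here as assumptions rather than derived; the sphere radius, the evaluation points and the finite-difference step are arbitrary illustrative choices.

```python
import math


def gamma_sphere(x):
    """Connection Gamma[d][a][b] on a 2-sphere; x = (theta, phi).

    Standard coefficients for ds^2 = a^2 (dtheta^2 + sin^2(theta) dphi^2);
    they are independent of the radius a.
    """
    theta = x[0]
    gam = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gam[0][1][1] = -math.sin(theta) * math.cos(theta)   # Gamma^theta_{phi phi}
    gam[1][0][1] = gam[1][1][0] = math.cos(theta) / math.sin(theta)  # Gamma^phi_{theta phi}
    return gam


def gamma_plane_polar(x):
    """Connection for the flat plane in polar coordinates; x = (r, phi)."""
    r = x[0]
    gam = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gam[0][1][1] = -r                      # Gamma^r_{phi phi}
    gam[1][0][1] = gam[1][1][0] = 1.0 / r  # Gamma^phi_{r phi}
    return gam


def riemann(gamma, x, d, a, b, c, h=1e-5):
    """R^d_{abc} as in (7.13), using central differences for the derivatives."""
    def d_gamma(k, i, j, l):
        # numerical partial derivative with respect to x^k of Gamma^i_{jl}
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        return (gamma(xp)[i][j][l] - gamma(xm)[i][j][l]) / (2.0 * h)

    gam = gamma(x)
    value = d_gamma(b, d, a, c) - d_gamma(c, d, a, b)
    for e in range(len(x)):
        value += gam[e][a][c] * gam[d][e][b] - gam[e][a][b] * gam[d][e][c]
    return value


radius = 2.0            # hypothetical sphere radius a
theta, phi = 1.0, 0.3   # arbitrary point away from the poles
r_up = riemann(gamma_sphere, [theta, phi], 0, 1, 0, 1)  # R^theta_{phi theta phi}
r_1212 = radius**2 * r_up                 # lower the index with g_{theta theta} = a^2
det_g = radius**4 * math.sin(theta)**2    # determinant of the sphere metric
k_sphere = r_1212 / det_g

r_flat = riemann(gamma_plane_polar, [1.5, 0.3], 0, 1, 0, 1)  # R^r_{phi r phi}

print("K on the sphere:", k_sphere, "compared with 1/a^2 =", 1.0 / radius**2)
print("R^r_{phi r phi} in the plane:", r_flat)
```

Up to finite-difference error, the sphere gives $K = 1/a^2$ at the chosen point, while the polar-coordinate plane gives a vanishing curvature component even though its connection coefficients are non-zero.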

The curvature tensor also satisfies a differential identity, which may be derived 
as follows. Let us once again adopt a geodesic coordinate system about some 
arbitrary point P. In this coordinate system, differentiating and then evaluating 
the result at P gives 


$$(\nabla_e R_{abcd})_P = (\partial_e R_{abcd})_P = (\partial_e \partial_c \Gamma_{abd} - \partial_e \partial_d \Gamma_{abc})_P.$$

Cyclically permuting $c$, $d$ and $e$ to obtain two further analogous relations and
adding, one finds that at $P$

$$\nabla_e R_{abcd} + \nabla_c R_{abde} + \nabla_d R_{abec} = 0. \tag{7.18}$$

This is, however, a tensor relation and thus holds in all coordinate systems;
moreover, since $P$ is arbitrary the relationship holds everywhere. This result is
known as the Bianchi identity and, using the antisymmetry relation (7.14), it may
be written more succinctly as

$$\nabla_{[e} R_{ab]cd} = 0.$$


7.11 The Ricci tensor and curvature scalar 

It follows from the symmetry properties (7.14-7.16) of the curvature tensor that 
it possesses only two independent contractions. We may find these by contracting 
either on the first two indices or on the first and last indices respectively. From 
(7.14), raising the index a and then contracting on the first two indices gives 

$$R^a{}_{acd} = 0.$$

Contracting on the first and last indices, however, gives in general a non-zero
result and this leads to a new tensor, the Ricci tensor. It is traditional to use the
same kernel letter for the Ricci tensor as for the curvature tensor, so we denote
its components by

$$R_{ab} \equiv R^c{}_{abc}.$$

By raising the index $a$ in the cyclic identity (7.17) and contracting with $d$, one
may easily show that the Ricci tensor is symmetric. Thus we have $R_a{}^b = R^b{}_a$ and
we can denote both by $R^b_a$.

A further contraction gives the curvature scalar (or Ricci scalar)

$$R \equiv g^{ab} R_{ab} = R^a_a,$$

where again the same kernel letter is used. This is a scalar quantity defined at 
each point of the manifold. 

The covariant derivatives of the Ricci tensor and the curvature scalar obey a 
particularly important relation, which will be central to our development of the 
field equations of general relativity. Raising a in the Bianchi identity (7.18) and 
contracting with d gives 

$$\nabla_e R_{bc} + \nabla_c R^a{}_{bae} + \nabla_a R^a{}_{bec} = 0,$$

which, on using the antisymmetry property (7.15) in the second term, gives

$$\nabla_e R_{bc} - \nabla_c R_{be} + \nabla_a R^a{}_{bec} = 0.$$

If we now raise $b$ and contract with $e$, we find

$$\nabla_b R^b_c - \nabla_c R + \nabla_a R^{ab}{}_{bc} = 0. \tag{7.19}$$

Using the antisymmetry properties (7.14, 7.15) we may write the third term as

$$\nabla_a R^{ab}{}_{bc} = \nabla_a R^{ba}{}_{cb} = \nabla_a R^a_c = \nabla_b R^b_c,$$

so the first and last terms in (7.19) are identical and we obtain

$$2 \nabla_b R^b_c - \nabla_c R = \nabla_b (2 R^b_c - \delta^b_c R) = 0.$$

Finally, raising the index $c$, we obtain the important result

$$\nabla_b \left( R^{bc} - \tfrac{1}{2} g^{bc} R \right) = 0.$$

The term in parentheses is called the Einstein tensor

$$G^{ab} \equiv R^{ab} - \tfrac{1}{2} g^{ab} R.$$

It is clearly symmetric and thus possesses only one independent divergence $\nabla_a G^{ab}$,
which vanishes (by construction). As we will see, it is this tensor that describes
the curvature of spacetime in the field equations of general relativity.


7.12 Curvature and parallel transport 

In Chapter 3, we remarked that parallel transport in a curved manifold was path 
dependent. We now have a more formal description of curvature. If a region 
of manifold is flat then the curvature tensor vanishes throughout the region; 
otherwise, it is curved. Thus there must be some link between the curvature tensor 
and parallel transport. 

Let us consider the parallel transport of a vector $\mathbf{v}$ around a closed curve $C$ 
in a manifold. We can define an arbitrary surface $A$ bounding the curve $C$ and 
break this surface up into a lot of small areas each bounded by closed curves 
$C_N$, as indicated in Figure 7.2. The change in the components $v^a$ on being 
parallel-transported around the closed curve $C$ is then 

$$ \Delta v^a = \sum_N (\Delta v^a)_N, $$

where $(\Delta v^a)_N$ is the change in $v^a$ around the small closed curve $C_N$. This follows 
because the changes in $v^a$ along any interior edge cancel (each such edge is shared 
by two neighbouring small curves and is traversed once in each direction), 
leaving just the contributions from the outer edges that make up the curve $C$. 

Let us now calculate $(\Delta v^a)_N$ around the small closed curve $C_N$ defined by the 
parametric equations $x^a(u)$. The equation for parallel transport is given by (3.41): 

$$ \frac{dv^a}{du} = -\Gamma^a{}_{bc}\, v^b\, \frac{dx^c}{du}. $$

Figure 7.2 An arbitrary surface A bounding a closed curve C. 

Thus, if $v^a$ is parallel-transported along the small closed curve $C_N$ from some 
initial point $P$ then at some other point along this curve we have 

$$ v^a(u) = v^a_P - \int_{u_P}^{u} \Gamma^a{}_{bc}\, v^b\, \frac{dx^c}{du}\, du. \tag{7.20} $$

However, since the closed curve is small we can expand the factors in the integrand 
about $P$ to first order in $x^a - x^a_P$: 

$$ \Gamma^a{}_{bc}(u) = (\Gamma^a{}_{bc})_P + (\partial_d \Gamma^a{}_{bc})_P\, [x^d(u) - x^d_P] + \cdots, $$

$$ v^a(u) = v^a_P - (\Gamma^a{}_{bc})_P\, v^b_P\, [x^c(u) - x^c_P] + \cdots. $$

Substituting these expressions into (7.20) and retaining terms only up to first order 
in $x^a - x^a_P$, we obtain 

$$ v^a(u) = v^a_P - (\Gamma^a{}_{bc})_P\, v^b_P \int_{u_P}^{u} \frac{dx^c}{du}\, du
   - (\partial_d \Gamma^a{}_{bc} - \Gamma^a{}_{ec} \Gamma^e{}_{bd})_P\, v^b_P \int_{u_P}^{u} (x^d - x^d_P)\, \frac{dx^c}{du}\, du. $$

If we integrate the coordinate differentials around a closed loop we have $\oint dx^c = 0$, 
and so we find that 

$$ \Delta v^a = -(\partial_d \Gamma^a{}_{bc} - \Gamma^a{}_{ec} \Gamma^e{}_{bd})_P\, v^b_P \oint x^d\, dx^c. $$

We may obtain an analogous result by interchanging the dummy indices $c$ and $d$. 
Now using the result 

$$ \oint d(x^c x^d) = \oint (x^c\, dx^d + x^d\, dx^c) = 0, $$

we find that 

$$ \Delta v^a = -\tfrac{1}{2} (\partial_c \Gamma^a{}_{bd} - \partial_d \Gamma^a{}_{bc} + \Gamma^a{}_{ec} \Gamma^e{}_{bd} - \Gamma^a{}_{ed} \Gamma^e{}_{bc})_P\, v^b_P \oint x^c\, dx^d. $$

On using the expression (7.13), we finally obtain 

$$ \Delta v^a = -\tfrac{1}{2} (R^a{}_{bcd})_P\, v^b_P \oint x^c\, dx^d. \tag{7.21} $$

Equation (7.21) establishes the link between the curvature tensor at a point 
$P$ and parallel transport around a small loop close to $P$. It tells us that the 
components $v^a$ will remain unchanged after parallel transportation around a small 
closed loop near $P$ if and only if the curvature tensor vanishes at $P$. So, returning 
to our construction of $(\Delta v^a)_N$, the vector components $v^a$ will not change on 
parallel transportation around the entire closed curve $C$ if the curvature tensor 
$R^a{}_{bcd}$ vanishes over the entire area $A$ bounding the curve. 

Figure 7.3 Parallel transport around a closed curve on the surface of a sphere 
and the surface of a cylinder. 

As an example, consider the parallel transportation of a vector around the 
closed triangle ABC on the surface of a sphere (see Figure 7.3). As shown in 
Appendix 7A, the curvature tensor is nowhere zero, and it is evident that the 
vector changes direction after parallel transportation around the triangle. However, 
as also mentioned in Appendix 7A, the curvature tensor vanishes everywhere 
on the surface of a cylinder and hence the components of a vector will remain 
unchanged if the vector is parallel-transported around any closed curve (see 
Figure 7.3). 
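
The path dependence described above can be checked numerically. The following Python 
sketch is an illustration added under stated assumptions, not a construction taken from 
the text: it integrates the parallel-transport equation (3.41) once around a circle of 
constant colatitude $\theta = \theta_0$ on a unit sphere (a simpler loop than the triangle ABC), 
using the two non-zero connection coefficients quoted in Appendix 7A, 
$\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \cos\theta/\sin\theta$. The function name 
`transport_around_latitude` and the fourth-order Runge-Kutta stepping are choices made 
here purely for illustration. 

```python
import math


def transport_around_latitude(theta0, v_theta, v_phi, steps=2000):
    """Parallel-transport (v^theta, v^phi) once around the circle theta = theta0
    on a unit sphere, using phi as the curve parameter (hypothetical helper)."""
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(vt, vp):
        # dv^a/dphi = -Gamma^a_{b phi} v^b along a curve with theta fixed:
        #   dv^theta/dphi = +sin(theta0) cos(theta0) v^phi
        #   dv^phi/dphi   = -cot(theta0) v^theta
        return s * c * vp, -(c / s) * vt

    h = 2.0 * math.pi / steps
    vt, vp = v_theta, v_phi
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = rhs(vt, vp)
        k2 = rhs(vt + 0.5 * h * k1[0], vp + 0.5 * h * k1[1])
        k3 = rhs(vt + 0.5 * h * k2[0], vp + 0.5 * h * k2[1])
        k4 = rhs(vt + h * k3[0], vp + h * k3[1])
        vt += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        vp += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return vt, vp
```

In this sketch the orthonormal components $(v^\theta, \sin\theta_0\, v^\phi)$ rotate through an angle 
$2\pi\cos\theta_0$ relative to the coordinate basis, so for $\theta_0 = \pi/3$ a vector that starts 
along the $\theta$-direction returns pointing in the opposite direction. On a cylinder, where 
the connection coefficients vanish in the coordinates of Appendix 7A, the same equations 
leave the components unchanged. 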


7.13 Curvature and geodesic deviation 

Another important consequence of curvature is that two nearby geodesics that 
are initially parallel either converge or diverge, depending on the local curvature. 
This is embodied in the equation of geodesic deviation , which we now derive. 

Consider two neighbouring geodesics, $C$ given by $x^a(u)$ and $\bar{C}$ given by $\bar{x}^a(u)$, 
where $u$ is an affine parameter, and let $\xi^a(u)$ be the small ‘vector’ connecting 
points on the two geodesics with the same parameter value (see Figure 7.4), i.e. 

$$ \bar{x}^a(u) = x^a(u) + \xi^a(u). $$

In particular, let us suppose that for some arbitrary value of $u$ the vector $\xi^a(u)$ 
connects the point $P$ on $C$ to the point $Q$ on $\bar{C}$. 

Once again our derivation is simplified considerably by constructing local 
geodesic coordinates about the point $P$, in which the connection coefficients 
vanish at $P$ but their derivatives are in general non-zero there. In this coordinate 
system, since $C$ and $\bar{C}$ are geodesics we have 

$$ \left( \frac{d^2 x^a}{du^2} \right)_P = 0, \tag{7.22} $$

$$ \left( \frac{d^2 \bar{x}^a}{du^2} + \Gamma^a{}_{bc}\, \frac{d\bar{x}^b}{du} \frac{d\bar{x}^c}{du} \right)_Q = 0, \tag{7.23} $$

at the points $P$ and $Q$ respectively. However, to first order in $\xi^a$, 

$$ \Gamma^a{}_{bc}(Q) = \Gamma^a{}_{bc}(P) + (\partial_d \Gamma^a{}_{bc})_P\, \xi^d = (\partial_d \Gamma^a{}_{bc})_P\, \xi^d. $$

Thus, subtracting (7.22) from (7.23) gives, to first order, at $P$ 

$$ \ddot{\xi}^a + (\partial_d \Gamma^a{}_{bc})_P\, \dot{x}^b \dot{x}^c \xi^d = 0, $$

where the dots denote $d/du$. However, in our geodesic coordinates the second-order 
intrinsic derivative of $\xi^a$ at $P$ is given by 

$$ \frac{D^2 \xi^a}{Du^2} = \frac{d}{du} \left( \dot{\xi}^a + \Gamma^a{}_{bc}\, \xi^b \dot{x}^c \right) = \ddot{\xi}^a + (\partial_d \Gamma^a{}_{bc})\, \xi^b \dot{x}^c \dot{x}^d, $$

where we have used the fact that $\Gamma^a{}_{bc}(P) = 0$; we note that nevertheless the 
derivatives of $\Gamma^a{}_{bc}$ at $P$ may not vanish. Thus, combining the last two equations 
and relabelling dummy indices, we find that at $P$ 

$$ \frac{D^2 \xi^a}{Du^2} + (\partial_b \Gamma^a{}_{cd} - \partial_d \Gamma^a{}_{bc})\, \xi^b \dot{x}^c \dot{x}^d = 0. $$

We may now identify the terms in parentheses on the left-hand side as components 
$R^a{}_{cbd}$ of the Riemann tensor when expressed in local geodesic coordinates about 
$P$. Thus we may write the above result as 

$$ \frac{D^2 \xi^a}{Du^2} + R^a{}_{cbd}\, \xi^b \dot{x}^c \dot{x}^d = 0, \tag{7.24} $$

which is clearly a tensor relation and is hence valid in any coordinate system. 
Moreover, since $P$ is an arbitrary point on $C$, this relation is valid everywhere 
along the curve. The result (7.24) is the equation of geodesic deviation. 

Figure 7.5 Converging geodesics on the surface of a sphere. 

The geometric meaning of (7.24) is straightforward. In a flat region of a 
manifold, $R^a{}_{bcd} = 0$ and we may adopt Cartesian coordinates throughout. In this case, 
$D/Du = d/du$ and the equation of geodesic deviation reduces to $d^2\xi^a/du^2 = 0$, 
which implies that $\xi^a(u) = A^a u + B^a$, where $A^a$ and $B^a$ are constants. So in a 
flat region the separation vector $\xi^a(u)$ connecting the two geodesics (which are 
simply straight lines in this case) in general increases linearly with $u$. In the 
special case where the two lines are initially parallel then they will remain so 
and hence never intersect. In a curved region of a manifold, however, $R^a{}_{bcd} \neq 0$ 
and so neighbouring geodesics either converge or diverge. For example, the two 
neighbouring geodesics AB and AC on the surface of a sphere (see Figure 7.5) 
converge as we approach the point A at the pole because the surface is positively 
curved. Equation (7.24) allows us to compute the rates of convergence or 
divergence of neighbouring geodesics for Riemannian spaces of arbitrary complexity. 
All one needs to do is to compute the curvature tensor (7.13) at each point using 
the metric. 
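
As a rough numerical illustration of this convergence, the sketch below integrates the 
geodesic deviation equation in the simplest curved case. It assumes (this reduction is 
not spelled out in the text) that on a two-dimensional surface of constant Gaussian 
curvature $K$, with arc length $s$ as affine parameter, the component $\xi$ of the separation 
orthogonal to the geodesic obeys $d^2\xi/ds^2 = -K\xi$; for a sphere of radius $a$ one has 
$K = 1/a^2$ (Appendix 7A), and $K = 0$ recovers the flat-space result $\xi = As + B$. The 
function name `integrate_deviation` and the velocity-Verlet scheme are illustrative choices. 

```python
import math


def integrate_deviation(K, xi0, dxi0, s_end, steps=10000):
    """Integrate d^2 xi / ds^2 = -K xi from s = 0 to s = s_end.

    K    : Gaussian curvature (assumed constant along the geodesic)
    xi0  : initial orthogonal separation
    dxi0 : initial rate of change of the separation (0 for parallel geodesics)
    Returns (xi, dxi/ds) at s = s_end.
    """
    h = s_end / steps
    xi, v = xi0, dxi0
    acc = -K * xi
    for _ in range(steps):
        # Velocity-Verlet update (second-order accurate).
        xi += h * v + 0.5 * h * h * acc
        new_acc = -K * xi
        v += 0.5 * h * (acc + new_acc)
        acc = new_acc
    return xi, v
```

For two meridians that start parallel at the equator of a sphere of radius $a$, calling 
`integrate_deviation(1 / a**2, xi0, 0.0, math.pi * a / 2)` gives a separation that has shrunk 
to zero at the pole, in line with Figure 7.5. 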


7.14 Tidal forces in a curved spacetime 

Now that we have derived the equation of geodesic deviation (7.24), we can 
give a more quantitative account of the gravitational tidal forces mentioned 
in our discussion of the equivalence principle in Section 7.2. Let us begin by 
working in Newtonian gravity and consider an initially spherical distribution of 
non-interacting particles freely falling towards the Earth (see Figure 7.6). Each 
particle moves on a straight line through the centre of the Earth, but those nearer 
the Earth fall faster because the gravitational attraction is stronger. Thus the 
sphere no longer remains a sphere but is distorted into an ellipsoid of the same 
volume: gravity has produced a tidal force in the sphere of particles that results 
in an elongation of the distribution in the direction of motion and a compression 
of the distribution in the transverse directions. Indeed, it is straightforward to 
show that, for two nearby particles with trajectories $x^i(t)$ and $\bar{x}^i(t)$ $(i = 1, 2, 3)$ 
respectively in Cartesian coordinates, the components of the separation vector 
$\zeta^i = \bar{x}^i - x^i$ evolve as 

$$ \frac{d^2 \zeta^i}{dt^2} = -\left( \frac{\partial^2 \Phi}{\partial x^i\, \partial x^j} \right) \zeta^j, $$

where $\Phi$ is the Newtonian gravitational potential (see Exercise 7.23). 

Figure 7.6 Tidal force on a collection of non-interacting particles. 
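
To make the elongation and compression concrete, the following sketch evaluates the 
Newtonian tidal matrix $-\partial^2\Phi/\partial x^i \partial x^j$ for the point-mass potential $\Phi = -GM/r$ of 
Exercise 7.1. The closed form used, $\partial^2\Phi/\partial x^i \partial x^j = GM(\delta_{ij}/r^3 - 3x^i x^j/r^5)$, follows 
by direct differentiation; the function names are hypothetical and chosen only for this 
illustration. 

```python
import math


def tidal_matrix(GM, x):
    """Return the 3x3 matrix A[i][j] = -d^2 Phi / dx^i dx^j for Phi = -GM/r,
    evaluated at the Cartesian position x = (x1, x2, x3)."""
    r = math.sqrt(sum(c * c for c in x))
    return [
        [
            -GM * ((1.0 if i == j else 0.0) / r**3 - 3.0 * x[i] * x[j] / r**5)
            for j in range(3)
        ]
        for i in range(3)
    ]


def relative_acceleration(GM, x, zeta):
    """Relative acceleration d^2 zeta^i / dt^2 = A[i][j] zeta^j of two nearby
    freely falling particles separated by the small vector zeta."""
    A = tidal_matrix(GM, x)
    return [sum(A[i][j] * zeta[j] for j in range(3)) for i in range(3)]
```

At a point on the $x^1$-axis a distance $r$ from the mass, the matrix is diagonal with entries 
$(2GM/r^3, -GM/r^3, -GM/r^3)$: a separation along the radial direction grows, while transverse 
separations shrink, and the trace vanishes because $\nabla^2\Phi = 0$ outside the mass. 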

A similar tidal effect occurs in general relativity and can be understood in 
terms of the curvature of the spacetime. In particular, we can gain some idea 
of the general-relativistic tidal forces by considering the equation of geodesic 
deviation (7.24). Consider any pair of our non-interacting particles. Each one 
is in free fall and so they must move along the timelike geodesics $x^\mu(\tau)$ and 
$\bar{x}^\mu(\tau)$ respectively, where $\tau$ is the proper time experienced by the first particle 
(say). If we define a small separation vector between the two particle worldlines 
by $\xi^\mu(\tau) = \bar{x}^\mu(\tau) - x^\mu(\tau)$, then (7.24) shows that it evolves according to the 
equation 

$$ \frac{D^2 \xi^\mu}{D\tau^2} = S^\mu{}_\nu\, \xi^\nu, \tag{7.25} $$

where we have defined the tidal stress tensor 

$$ S^\mu{}_\nu \equiv R^\mu{}_{\sigma\rho\nu}\, u^\sigma u^\rho, \tag{7.26} $$

in which $u^\sigma \equiv dx^\sigma/d\tau$ is the 4-velocity of the first particle. Note that in defining 
$S^\mu{}_\nu$ we have made use of the fact that the curvature tensor is antisymmetric in 
its last two indices. The result (7.25) is a fully covariant tensor equation and 
therefore holds in any coordinate system. 

To understand the physical consequences of the geodesic deviation effect, it is 
helpful to consider how some observer will view the relative spatial acceleration 
of the two particles. Suppose that our observer is sitting on the first particle, 
the worldline $x^\mu(\tau)$ of which passes through some event $P$. In order to calculate 
the relative spatial acceleration measured by our observer, we may erect a set of 
orthonormal basis vectors $\hat{\mathbf{e}}_\alpha$ at $P$ that define the instantaneous rest frame (IRF) 
of the first particle (and the observer) at this event. The timelike basis vector is 
given simply by $\hat{\mathbf{e}}_0 = \hat{\mathbf{u}}$, where $\hat{\mathbf{u}} = \mathbf{u}/c$ is the normalised 4-velocity at $P$ of the 
first particle, and we may choose the spacelike basis vectors $\hat{\mathbf{e}}_i$ in any way, provided 
that the full set satisfies 

$$ \hat{\mathbf{e}}_\alpha \cdot \hat{\mathbf{e}}_\beta = \eta_{\alpha\beta}. $$

In this way, the duals of these basis vectors, which are given by $\hat{\mathbf{e}}^\alpha = \eta^{\alpha\beta} \hat{\mathbf{e}}_\beta$, 
also form an orthonormal set. The general situation is illustrated schematically in 
Figure 7.7. 

Figure 7.7 Schematic illustration of the basis vectors of the instantaneous rest 
frame at P. A general connecting vector $\boldsymbol{\xi}$ and the orthogonal connecting vector 
$\boldsymbol{\zeta}$ are also shown. 

The components of the separation vector $\boldsymbol{\xi}$ with respect to our new frame are 
$\xi^{\hat{\alpha}} = \hat{\mathbf{e}}^\alpha \cdot \boldsymbol{\xi}$; 
these components give the temporal and spatial separations of the events $P$ and 
$Q$ on the two particle worldlines, as measured by our observer. Since the $\hat{\mathbf{e}}_\alpha$ 
$(\alpha = 0, 1, 2, 3)$ are the basis vectors of an inertial Cartesian coordinate system at $P$, 
the intrinsic derivative in this coordinate system is simply equal to the ordinary 
derivative. Moreover, with respect to the IRF, the 4-velocity of the first particle 
is simply $[u^{\hat{\mu}}] = (c, \vec{0})$. Thus from (7.25) we have 

$$ \frac{d^2 \xi^{\hat{\alpha}}}{d\tau^2} = c^2 R^{\hat{\alpha}}{}_{\hat{0}\hat{0}\hat{\beta}}\, \xi^{\hat{\beta}}, \tag{7.27} $$

where the components of the curvature tensor in the Cartesian inertial frame at $P$ 
may be written as 

$$ R^{\hat{\alpha}}{}_{\hat{\beta}\hat{\gamma}\hat{\delta}} = R^\mu{}_{\nu\rho\sigma}\, (\hat{e}^\alpha)_\mu (\hat{e}_\beta)^\nu (\hat{e}_\gamma)^\rho (\hat{e}_\delta)^\sigma. \tag{7.28} $$

Equation (7.27) in fact holds for any orthonormal freely falling frame $\hat{\mathbf{e}}_\alpha$. 

Clearly, the general separation vector $\boldsymbol{\xi}$ is inappropriate for our discussion of 
the evolution of the spatial separation seen by our observer at $P$, since typically 
$\boldsymbol{\xi}$ will have some temporal component in the observer’s frame. Thus, we must 
work instead with the orthogonal connecting vector $\boldsymbol{\zeta}(\tau)$ shown in Figure 7.7, 
which has a zero component in the $\hat{\mathbf{e}}_0$-direction, i.e. $\zeta^{\hat{0}} = 0$. Since (7.27) is valid 
for any small connecting vector it must also hold for the orthogonal connecting 
vector $\boldsymbol{\zeta}$, but we must remember that $\zeta^{\hat{0}}(\tau) = 0$ for all $\tau$. 

A useful alternative interpretation of (7.25) or (7.27) is that it gives the force 
per unit mass required to keep two particles moving along parallel curves; this 
force must be supplied by some mechanical means. For example, the worldline 
of the centre of mass of a rigid body in free fall is a timelike geodesic, but this 
is not true of the other parts of the object, which are constrained to move along 
curves parallel to the centre of mass rather than along neighbouring geodesics. 
The necessary forces must be supplied by internal stresses in the object. The 
physical magnitude of the stresses is most easily found by solving the eigenvalue 
problem 

$$ S^\mu{}_\nu\, v^\nu = \lambda v^\mu, \tag{7.29} $$

where $S^\mu{}_\nu$ is given by (7.26). One of the eigenvalues is always zero (for $v^\mu = u^\mu$), 
and the remaining three eigenvalues give the principal stresses in the object. 


Appendix 7A: The surface of a sphere 

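The metric⁵ of the surface of a sphere in spherical polar coordinates is 

$$ ds^2 = a^2\, d\theta^2 + a^2 \sin^2\theta\, d\phi^2. $$

⁵ Note that this term is often applied, as here, to the line element itself. 

To get used to handling problems involving curved spaces you should calculate 
the components of the affine connection, starting from this metric. The definition 
of the affine connection is 

$$ \Gamma^a{}_{bc} = \tfrac{1}{2} g^{ad} (\partial_b g_{dc} + \partial_c g_{bd} - \partial_d g_{bc}), $$

as given in (3.21), and in two dimensions there are six independent connection 
coefficients, 

$$ \Gamma^1{}_{11}, \quad \Gamma^1{}_{12}, \quad \Gamma^1{}_{22}, \quad \Gamma^2{}_{11}, \quad \Gamma^2{}_{12}, \quad \Gamma^2{}_{22}. $$

With $x^1 = \theta$ and $x^2 = \phi$, these coefficients are given by (exercise): 

$$ \Gamma^1{}_{11} = \tfrac{1}{2} g^{11} (\partial_1 g_{11} + \partial_1 g_{11} - \partial_1 g_{11}) = 0, $$
$$ \Gamma^1{}_{12} = \tfrac{1}{2} g^{11} (\partial_1 g_{12} + \partial_2 g_{11} - \partial_1 g_{12}) = 0, $$
$$ \Gamma^1{}_{22} = \tfrac{1}{2} g^{11} (\partial_2 g_{12} + \partial_2 g_{21} - \partial_1 g_{22}) = -\tfrac{1}{2} g^{11} \partial_1 g_{22}, $$
$$ \Gamma^2{}_{11} = \tfrac{1}{2} g^{22} (\partial_1 g_{21} + \partial_1 g_{12} - \partial_2 g_{11}) = 0, $$
$$ \Gamma^2{}_{12} = \tfrac{1}{2} g^{22} (\partial_1 g_{22} + \partial_2 g_{12} - \partial_2 g_{12}) = \tfrac{1}{2} g^{22} \partial_1 g_{22}, $$
$$ \Gamma^2{}_{22} = \tfrac{1}{2} g^{22} (\partial_2 g_{22} + \partial_2 g_{22} - \partial_2 g_{22}) = 0. $$

So, the only two non-zero coefficients are 

$$ \Gamma^1{}_{22} = -\frac{1}{2a^2}\, 2a^2 \sin\theta \cos\theta = -\sin\theta \cos\theta, $$
$$ \Gamma^2{}_{12} = \frac{1}{2a^2 \sin^2\theta}\, 2a^2 \sin\theta \cos\theta = \frac{\cos\theta}{\sin\theta}. $$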


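The curvature tensor is 

$$ R_{abcd} = \tfrac{1}{2} (\partial_d \partial_a g_{bc} - \partial_d \partial_b g_{ac} + \partial_c \partial_b g_{ad} - \partial_c \partial_a g_{bd}) - g_{ef} (\Gamma^e{}_{ac} \Gamma^f{}_{bd} - \Gamma^e{}_{ad} \Gamma^f{}_{bc}), $$

and in two dimensions the symmetry properties of this tensor mean that there is 
only one independent component. We can take this to be $R_{1212}$, so fortunately we 
only have to calculate this single component: 

$$ R_{1212} = \tfrac{1}{2} (\partial_2 \partial_1 g_{21} - \partial_2 \partial_2 g_{11} + \partial_1 \partial_2 g_{12} - \partial_1 \partial_1 g_{22}) - g_{ef} (\Gamma^e{}_{11} \Gamma^f{}_{22} - \Gamma^e{}_{12} \Gamma^f{}_{21}) $$
$$ \phantom{R_{1212}} = -\tfrac{1}{2} \partial_1 \partial_1 g_{22} - g_{11} (\Gamma^1{}_{11} \Gamma^1{}_{22} - \Gamma^1{}_{12} \Gamma^1{}_{12}) - g_{22} (\Gamma^2{}_{11} \Gamma^2{}_{22} - \Gamma^2{}_{12} \Gamma^2{}_{12}) $$
$$ \phantom{R_{1212}} = a^2 \sin^2\theta. $$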


Thus the Gaussian curvature $K$ of a spherical surface is given by 

$$ K \equiv \frac{R_{1212}}{g} = \frac{a^2 \sin^2\theta}{a^4 \sin^2\theta} = \frac{1}{a^2}. $$

Instead of a spherical surface, we could consider the surface of a 
cylinder of radius $a$. The metric of the surface in cylindrical polar coordinates is 

$$ ds^2 = a^2\, d\phi^2 + dz^2, $$

and it is obvious that this two-dimensional space is flat because we can 
transform the metric into the form 

$$ ds^2 = dx^2 + dz^2 $$

by the coordinate transformation $x = a\phi$. It therefore follows that the curvature 
of a cylindrical surface vanishes. 
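
The connection coefficients obtained above can be cross-checked numerically. The sketch 
below is an illustration under stated assumptions rather than part of the appendix: it 
evaluates $\Gamma^a{}_{bc} = \tfrac{1}{2} g^{ad}(\partial_b g_{dc} + \partial_c g_{bd} - \partial_d g_{bc})$ from central finite differences 
of a two-dimensional metric. The function names, the step size `h` and the zero-based 
index convention (index 0 for $\theta$, index 1 for $\phi$ or $z$) are all choices made here. 

```python
import math


def sphere_metric(x, a=1.0):
    """Metric components g_ab of a sphere of radius a at x = (theta, phi)."""
    theta = x[0]
    return [[a * a, 0.0], [0.0, (a * math.sin(theta)) ** 2]]


def cylinder_metric(x, a=1.0):
    """Metric components g_ab of a cylinder of radius a at x = (phi, z)."""
    return [[a * a, 0.0], [0.0, 1.0]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix, used to obtain g^ab from g_ab."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Return gamma[a][b][c] approximating Gamma^a_{bc} at the point x."""
    n = 2
    ginv = inverse_2x2(metric(x))
    # dg[d][b][c] approximates the partial derivative of g_bc with respect to x^d.
    dg = []
    for d in range(n):
        xp, xm = list(x), list(x)
        xp[d] += h
        xm[d] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[b][c] - gm[b][c]) / (2.0 * h) for c in range(n)]
                   for b in range(n)])
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                gamma[a][b][c] = 0.5 * sum(
                    ginv[a][d] * (dg[b][d][c] + dg[c][b][d] - dg[d][b][c])
                    for d in range(n)
                )
    return gamma
```

Calling `christoffel(sphere_metric, (1.0, 0.0))` should reproduce $-\sin\theta\cos\theta$ in 
`gamma[0][1][1]` and $\cos\theta/\sin\theta$ in `gamma[1][0][1]` at $\theta = 1$, to within the finite-difference 
error, while `christoffel(cylinder_metric, (0.3, 0.7))` returns zeros throughout. 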


Exercises 

7.1 From Poisson’s equation $\nabla^2 \Phi = 4\pi G \rho$ show that the gravitational potential outside 
a spherical object of mass $M$ at a radial distance $r$ from its centre is given by 
$\Phi(r) = -GM/r$. What is the form of $\Phi(r)$ inside a uniform spherical body? 

7.2 A charged object held stationary in a laboratory on the surface of the Earth does not 
emit electromagnetic radiation. If the object is then dropped so that it is in free fall, it 
will begin to radiate. Reconcile these observations with the principle of equivalence. 
Hint: Consider the spatial extent of the electric field of the charge. 

7.3 If $X^\mu$ is a local Cartesian coordinate system at some event $P$, show that so too is the 
coordinate system $X'^\mu = \Lambda^\mu{}_\nu X^\nu$, where $\Lambda^\mu{}_\nu$ defines a Lorentz transformation. 

7.4 If two vectors $\mathbf{v}$ and $\mathbf{w}$ are Fermi-Walker-transported along some observer’s 
worldline, show that their scalar product $\mathbf{v} \cdot \mathbf{w}$ is preserved at all points along the line. 

7.5 Photons of frequency $\nu_E$ are emitted from the surface of the Sun and observed by an 
astronaut with fixed spatial coordinates at a large distance away. Obtain an expression 
for the frequency $\nu_0$ of the photons as measured by the astronaut. Hence estimate 
the observed redshift of the photon. 

7.6 An experimenter A drops a pebble of rest mass $m$ in a uniform gravitational field $g$. 
At a distance $h$ below A, experimenter B converts the pebble (with no energy loss) 
into a photon of frequency $\nu_B$. The photon passes by A, who observes it to have 
frequency $\nu_A$. Use simple physical arguments to show that to a first approximation 

$$ \frac{\nu_B}{\nu_A} = 1 + \frac{gh}{c^2}. $$

Use this result to argue that for two stationary observers A and B in a weak 
gravitational field with potential $\Phi$, the ratio of the rates at which their laboratory clocks 
run is $1 + \Delta\Phi/c^2$, where $\Delta\Phi$ is the potential difference between A and B. 

7.7 A satellite is in circular polar orbit of radius $r$ around the Earth (radius $R$, mass $M$). 
A standard clock $C$ on the satellite is compared with an identical clock $C_0$ at the 
south pole on Earth. Show that the ratio of the rate of the orbiting clock to that of 
the clock on Earth is approximately 

$$ 1 + \frac{GM}{Rc^2} - \frac{3GM}{2rc^2}. $$

Note that the orbiting clock is faster only if $r > \tfrac{3}{2}R$, i.e. if $r - R > 3184$ km. 

7.8 Consider the limit of a weak gravitational field in a coordinate system in which 
$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, with $|h_{\mu\nu}| \ll 1$, and $\partial_0 g_{\mu\nu} = 0$. Keeping only terms that are first 
order in $v/c$, show that the equation of motion for a slowly moving test particle 
takes the form 

$$ \frac{d^2 x^i}{dt^2} \approx -\tfrac{1}{2} c^2 \delta^{ij} \partial_j h_{00} + c\, \delta^{ik} (\partial_j h_{0k} - \partial_k h_{0j})\, v^j. $$

Give a physical interpretation of the second term on the right-hand side. 

7.9 Show that in a two-dimensional Riemannian manifold all the components of $R_{abcd}$ 
are equal either to zero or to $\pm R_{1212}$. 

7.10 Show that the line element $ds^2 = y^2\, dx^2 + x^2\, dy^2$ represents the Euclidean plane, but 
the line element $ds^2 = y\, dx^2 + x\, dy^2$ represents a curved two-dimensional manifold. 

7.11 For a two-dimensional manifold with line element $ds^2 = dr^2 + f^2(r)\, d\theta^2$, show that 
the Gaussian curvature is given by $K = -f''/f$, where a prime denotes $d/dr$. 

7.12 By calculating the components of the curvature tensor $R^d{}_{abc}$ in each case, show that 
the line element 

$$ ds^2 = \frac{a^2}{a^2 - r^2}\, dr^2 + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\phi^2 $$

represents a curved three-dimensional manifold. Show that the manifold is flat in 
the limit $a \to \infty$. 

7.13 A spacetime has the metric 

$$ ds^2 = c^2\, dt^2 - a^2(t) (dx^2 + dy^2 + dz^2). $$

Show that the only non-zero connection coefficients are 

$$ \Gamma^0{}_{11} = \Gamma^0{}_{22} = \Gamma^0{}_{33} = a\dot{a}/c^2 \quad \text{and} \quad \Gamma^1{}_{10} = \Gamma^2{}_{20} = \Gamma^3{}_{30} = \dot{a}/a. $$

Deduce that particles may be at rest in such a spacetime and that for such particles 
the coordinate $t$ is their proper time. Show further that the non-zero components of 
the Ricci tensor are 

$$ R_{00} = 3\ddot{a}/a \quad \text{and} \quad R_{11} = R_{22} = R_{33} = -c^{-2} (a\ddot{a} + 2\dot{a}^2). $$

Hence show that the 00-component of the Einstein tensor is $G_{00} = -3\dot{a}^2/a^2$. 

7.14 Show that the covariant components of the curvature tensor are given by 

$$ R_{abcd} = \tfrac{1}{2} (\partial_d \partial_a g_{bc} - \partial_d \partial_b g_{ac} + \partial_c \partial_b g_{ad} - \partial_c \partial_a g_{bd}) - g^{ef} (\Gamma_{eac} \Gamma_{fbd} - \Gamma_{ead} \Gamma_{fbc}), $$

and hence verify its symmetries. Show further that, for an $N$-dimensional manifold, 
the number of independent components is $N^2 (N^2 - 1)/12$. 

7.15 Show that for any two-dimensional manifold the covariant curvature tensor has the form 

$$ R_{abcd} = K (g_{ac} g_{bd} - g_{ad} g_{bc}), $$

where the scalar $K$ may be a function of the coordinates. Why does this result not 
generalise to arbitrary manifolds of higher dimension? 

7.16 If $v^a$ are the contravariant components of a vector and $T^{ab}$ are the contravariant 
components of a rank-2 tensor, prove the results 

$$ \nabla_c \nabla_b v^a - \nabla_b \nabla_c v^a = -R^a{}_{dbc}\, v^d, $$

$$ \nabla_d \nabla_c T^{ab} - \nabla_c \nabla_d T^{ab} = -R^a{}_{ecd}\, T^{eb} - R^b{}_{ecd}\, T^{ae}. $$

Can you guess the corresponding result for the mixed components $T^{ab}{}_c$ of a rank-3 
tensor? 

7.17 Show that any Killing vector $v^a$, as defined in Exercise 4.11, satisfies the relations 

$$ \nabla_c \nabla_b v^a = R^a{}_{bcd}\, v^d, $$

$$ v^a \nabla_a R = 0. $$

7.18 Calculate explicit forms for the Ricci tensor $R_{ab}$ and the Ricci scalar $R$ in terms of 
the metric, the connection and its partial derivatives. 

7.19 Prove that the Ricci tensor $R_{ab}$ is symmetric. 

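7.20 A conformal transformation, such as that in Exercise 2.7, is not a change of 
coordinates but an actual change in the geometry of a manifold such that the metric 
tensor transforms as 

$$ \tilde{g}_{ab}(x) = \Omega^2(x)\, g_{ab}(x), $$

where $\Omega(x)$ is some non-vanishing scalar function of position. Show that, under 
such a transformation, the metric connection transforms as 

$$ \tilde{\Gamma}^a{}_{bc} = \Gamma^a{}_{bc} + \frac{1}{\Omega} \left( \delta^a_b\, \partial_c \Omega + \delta^a_c\, \partial_b \Omega - g_{bc}\, g^{ad}\, \partial_d \Omega \right). $$

Hence show that the curvature tensor, the Ricci tensor and the Ricci scalar transform 
respectively as 

$$ \tilde{R}^a{}_{bcd} = R^a{}_{bcd} - 2 \left( \delta^a_{[c} \delta^e_{d]} \delta^f_b - g_{b[c} \delta^e_{d]} g^{af} \right) \frac{\nabla_e \nabla_f \Omega}{\Omega} $$
$$ \qquad\qquad + 2 \left( 2 \delta^a_{[c} \delta^e_{d]} \delta^f_b - 2 g_{b[c} \delta^e_{d]} g^{af} + g_{b[c} \delta^a_{d]} g^{ef} \right) \frac{(\nabla_e \Omega)(\nabla_f \Omega)}{\Omega^2}, $$

$$ \tilde{R}_{bc} = R_{bc} + \left[ (N-2) \delta^f_b \delta^e_c + g_{bc} g^{ef} \right] \frac{\nabla_e \nabla_f \Omega}{\Omega}
   - \left[ 2(N-2) \delta^f_b \delta^e_c - (N-3) g_{bc} g^{ef} \right] \frac{(\nabla_e \Omega)(\nabla_f \Omega)}{\Omega^2}, $$

$$ \tilde{R} = \frac{R}{\Omega^2} + 2(N-1)\, g^{ef}\, \frac{\nabla_e \nabla_f \Omega}{\Omega^3} + (N-1)(N-4)\, g^{ef}\, \frac{(\nabla_e \Omega)(\nabla_f \Omega)}{\Omega^4}, $$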

where $N$ is the dimension of the manifold. 

7.21 Show that parallel transportation of a vector around the closed triangle ABC on the 
surface of a sphere, as shown in Figure 7.3, results in a vector that is orthogonal to 
its original direction. 

7.22 On the surface of a sphere, show that, along the geodesic $\phi$ = constant, the geodesic 
deviation vector $\xi^a$ satisfies 

$$ \frac{D^2 \xi^\theta}{Ds^2} = 0, \qquad \frac{D^2 \xi^\phi}{Ds^2} = -\xi^\phi \left( \frac{d\theta}{ds} \right)^2. $$

Choose a geodesic $\phi = \phi_0$ with path length $s = \theta$ measured from $\theta = 0$, and a 
neighbouring geodesic $\phi = \phi_0 + \delta\phi_0$, also with $s = \theta$, and define $\xi^a(\theta)$ as the vector 
between $s = \theta$ on one and $s = \theta$ on the other. Show that $\xi^a(\theta) = (0, \delta\phi_0)$ for all $\theta$. 
Show in addition that if $\xi = 0$ when $\theta = 0$ then 

$$ \xi_a \xi^a = l^2 \sin^2\theta, $$

where $l^2$ is a constant, and that the two geodesics pass through $\theta = \pi$. 

7.23 In Newtonian gravity, consider two nearby particles with trajectories $x^i(t)$ and 
$\bar{x}^i(t)$ $(i = 1, 2, 3)$ respectively in Cartesian coordinates. Show that the components 
of the separation vector $\zeta^i = \bar{x}^i - x^i$ evolve as 

$$ \frac{d^2 \zeta^i}{dt^2} = -\left( \frac{\partial^2 \Phi}{\partial x^i\, \partial x^j} \right) \zeta^j, $$

where $\Phi$ is the Newtonian gravitational potential. 

7.24 In the weak-field, Newtonian, limit of general relativity, we may choose coordinates 
such that $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, where $|h_{\mu\nu}| \ll 1$, and we assume that all particle velocities 
are small compared with $c$. By considering the equation of geodesic deviation, 
show that the general-relativistic tidal force reduces to the Newtonian limit given 
in Exercise 7.23. 


Let us now follow Einstein’s suggestion that gravity is a manifestation of 
spacetime curvature induced by the presence of matter. We must therefore obtain a set 
of equations that describe quantitatively how the curvature of spacetime at any 
event is related to the matter distribution at that event. These will be the 
gravitational field equations, or Einstein equations, in the same way that the Maxwell 
equations are the field equations of electromagnetism. 

Maxwell’s equations relate the electromagnetic field $\mathbf{F}$ at any event to its 
source, the 4-current density $\mathbf{j}$ at that event. Similarly, Einstein’s equations relate 
spacetime curvature to its source, the energy-momentum of matter. As we shall 
see, the analogy goes further. In any given coordinate system, Maxwell’s 
equations are second-order partial differential equations for the components $F_{\mu\nu}$ of 
the electromagnetic field tensor (or equivalently for the components $A_\mu$ of the 
electromagnetic potential). We shall find that Einstein’s equations are also a set of 
second-order partial differential equations, but instead for the metric coefficients 
$g_{\mu\nu}$ of spacetime. 


8.1 The energy-momentum tensor 

To construct the gravitational field equations, we must first find a properly 
relativistic (or covariant) way of expressing the source term. In other words, we must 
identify a tensor that describes the matter distribution at each event in spacetime. 

We will use our discussion of the 4-current density in Chapter 6 as a guide. Thus, 
let us consider some general time-dependent distribution of (electrically neutral) 
non-interacting particles, each of rest mass $m_0$. This is commonly called dust in 
the literature. At each event $P$ in spacetime we can characterise the distribution 
completely by giving the matter density $\rho$ and 3-velocity $\vec{u}$ as measured in some 
inertial frame. For simplicity, let us consider the fluid in its instantaneous rest 
frame $S$ at $P$, in which $\vec{u} = \vec{0}$. In this frame, the (proper) density is given by 
$\rho_0 = m_0 n_0$, where $m_0$ is the rest mass of each particle and $n_0$ is the number of 
particles in a unit volume. In some other frame $S'$, moving with speed $v$ relative 
to $S$, the volume containing a fixed number of particles is Lorentz contracted 
along the direction of motion (see Figure 8.1). Hence, in $S'$ the number density 
of particles is $n' = \gamma_v n_0$. We now have an additional effect, however, since the 
mass of each particle in $S'$ is $m' = \gamma_v m_0$. Thus, the matter density in $S'$ is 

$$ \rho' = \gamma_v^2 \rho_0. $$

Figure 8.1 The Lorentz contraction of a fluid element in the direction of motion. 

We may conclude that the matter density is not a scalar but does transform as 
the 00-component of a rank-2 tensor. This suggests that the source term in the 
gravitational field equations should be a rank-2 tensor. At each point in spacetime, 
the obvious choice is 

$$ \mathbf{T}(x) = \rho_0(x)\, \mathbf{u}(x) \otimes \mathbf{u}(x), \tag{8.1} $$

where $\rho_0(x)$ is the proper density of the fluid, i.e. that measured by an observer 
comoving with the local flow, and $\mathbf{u}(x)$ is its 4-velocity. The tensor $\mathbf{T}(x)$ is 
called the energy-momentum tensor (or the stress-energy tensor) of the matter 
distribution. We will see the reason for these names shortly. Note that from now 
on we will denote the proper density simply by $\rho$, i.e. without the zero subscript. 
In some arbitrary coordinate system $x^\mu$, in which the 4-velocity of the fluid is 
$u^\mu$, the contravariant components of (8.1) are given simply by 

$$ T^{\mu\nu} = \rho\, u^\mu u^\nu. \tag{8.2} $$

To give a physical interpretation of the components of the energy-momentum 
tensor, it is convenient to consider a local Cartesian inertial frame at $P$ in which 
the set of components of the 4-velocity of the fluid is $[u^\mu] = \gamma_u (c, \vec{u})$. In this 
frame, writing out the components in full we have 

$$ T^{00} = \rho u^0 u^0 = \gamma_u^2 \rho c^2, $$
$$ T^{0i} = T^{i0} = \rho u^0 u^i = \gamma_u^2 \rho c u^i, $$
$$ T^{ij} = \rho u^i u^j = \gamma_u^2 \rho u^i u^j, $$

where on the right-hand sides $u^i$ denotes a component of the 3-velocity $\vec{u}$. 
Thus the physical meanings of these components in this frame are as follows: 

$T^{00}$ is the energy density of the particles; 

$T^{0i}$ is the energy flux $\times\, c^{-1}$ in the $i$-direction; 

$T^{i0}$ is the momentum density $\times\, c$ in the $i$-direction; 

$T^{ij}$ is the rate of flow of the $i$-component of momentum per unit area in the 
$j$-direction. 

It is because of these identifications that the tensor T is known as the energy- 
momentum or stress-energy tensor. 
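
The component expressions for dust can be tabulated numerically. The sketch below is 
an illustrative aid under stated assumptions: it builds $T^{\mu\nu} = \rho u^\mu u^\nu$ from the proper 
density $\rho$ and the 3-velocity of the fluid in a local Cartesian inertial frame, with 
$[u^\mu] = \gamma_u (c, \vec{u})$. The function name `dust_tensor` and the use of nested Python lists 
for the components are choices made here. 

```python
import math

C = 299_792_458.0  # speed of light in m/s


def dust_tensor(rho, velocity, c=C):
    """Return the 4x4 list T[mu][nu] = rho * u^mu * u^nu for dust with proper
    density rho and 3-velocity `velocity` (a sequence of three components)."""
    speed2 = sum(v * v for v in velocity)
    gamma = 1.0 / math.sqrt(1.0 - speed2 / (c * c))
    # 4-velocity components [u^mu] = gamma * (c, u1, u2, u3).
    u = [gamma * c] + [gamma * v for v in velocity]
    return [[rho * u[m] * u[n] for n in range(4)] for m in range(4)]
```

In units with $c = 1$ and a fluid moving at $0.6c$ along the $x$-axis, `dust_tensor(1.0, (0.6, 0, 0), c=1.0)` 
gives $T^{00} = \gamma_u^2 \rho = 1.5625$, which is the factor $\gamma^2$ by which the matter density was 
found to transform above. 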


8.2 The energy-momentum tensor of a perfect fluid 

To generalise our discussion to real fluids, we have to take account of the facts that 
(i) besides the bulk motion of the fluid, each particle has some random (thermal) 
velocity and (ii) there may be various forces between the particles that contribute 
potential energies to the total. The physical meanings of the components of the 
energy-momentum tensor T give us an insight into how to generalise its form to 
include these properties of real fluids. 

Let us consider $\mathbf{T}$ at some event $P$ and work in a local Cartesian inertial frame 
$S$ that is the IRF of the fluid at $P$. For dust, the only non-zero component is $T^{00}$. 
However, let us consider the components of $\mathbf{T}$ in the IRF for a real fluid. 

• $T^{00}$ is the total energy density, including any potential energy contributions from forces 
between the particles and kinetic energy from their random thermal motions. 

• $T^{0i}$: although there is no bulk motion, energy might be transmitted by heat conduction, 
so this is basically a heat conduction term in the IRF. 

• $T^{i0}$: again, although the particles have no bulk motion, if heat is being conducted then 
the energy will carry momentum. 

• $T^{ij}$: the random thermal motions of the particles will give rise to momentum flow, 
so that $T^{ii}$ is the isotropic pressure in the $i$-direction and the $T^{ij}$ (with $i \neq j$) are the 
viscous stresses in the fluid. 

These identifications are valid for a general fluid. A perfect fluid is defined as 
one for which there are no forces between the particles, and no heat conduction 
or viscosity in the IRF. Thus, in the IRF the components of $\mathbf{T}$ for a perfect fluid 
are given by 

$$ [T^{\mu\nu}] = \begin{pmatrix} \rho c^2 & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}. \tag{8.3} $$

It is not hard to show that 

$$ T^{\mu\nu} = (\rho + p/c^2)\, u^\mu u^\nu - p\, \eta^{\mu\nu}. \tag{8.4} $$

However, because of the way in which we have written this equation, it must 
be valid in any local Cartesian inertial frame at $P$. Moreover, we can obtain an 
expression that is valid in an arbitrary coordinate system simply by replacing 
$\eta^{\mu\nu}$ with the metric functions $g^{\mu\nu}$ in the arbitrary system. Thus, we arrive at a 
fully covariant expression for the components of the energy-momentum tensor of 
a perfect fluid: 

$$ T^{\mu\nu} = (\rho + p/c^2)\, u^\mu u^\nu - p\, g^{\mu\nu}. \tag{8.5} $$

We see that $T^{\mu\nu}$ is symmetric and is made up from the two scalar fields $\rho$ and $p$ 
and the vector field $\mathbf{u}$ that characterise the perfect fluid. We also see that in the 
limit $p \to 0$ a perfect fluid becomes dust. 

Finally, we note that it is possible to give more complicated expressions 
representing the energy-momentum tensors for imperfect fluids, for charged fluids and 
even for the electromagnetic field. These tensors are all symmetric. 
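
The claim that the form (8.4) holds in every local Cartesian inertial frame can be 
checked numerically. The following sketch, offered as an illustration under stated 
assumptions, evaluates (8.4) in the IRF, boosts the resulting components along the 
$x$-axis with a Lorentz transformation, and allows the outcome to be compared with (8.4) 
evaluated directly with the boosted 4-velocity. The signature $(+, -, -, -)$ follows the 
text; the function names and the list-of-lists representation are hypothetical choices. 

```python
import math

# Inverse Minkowski metric eta^{mu nu} with signature (+, -, -, -).
ETA_INV = [[1.0, 0.0, 0.0, 0.0],
           [0.0, -1.0, 0.0, 0.0],
           [0.0, 0.0, -1.0, 0.0],
           [0.0, 0.0, 0.0, -1.0]]


def perfect_fluid_tensor(rho, p, u, c):
    """T^{mu nu} = (rho + p/c^2) u^mu u^nu - p eta^{mu nu}, as in (8.4)."""
    return [[(rho + p / c**2) * u[m] * u[n] - p * ETA_INV[m][n]
             for n in range(4)] for m in range(4)]


def boost_x(beta):
    """Lorentz boost matrix that sets a fluid at rest moving with speed
    beta*c along the positive x-direction."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, gamma * beta, 0.0, 0.0],
            [gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_vector(L, u):
    """u'^mu = L^mu_alpha u^alpha."""
    return [sum(L[m][a] * u[a] for a in range(4)) for m in range(4)]


def transform_tensor(L, T):
    """T'^{mu nu} = L^mu_alpha L^nu_beta T^{alpha beta}."""
    return [[sum(L[m][a] * L[n][b] * T[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]
```

With $c = 1$, $\rho = 2$, $p = 0.5$ and $\beta = 0.6$, boosting the rest-frame components 
$\mathrm{diag}(\rho c^2, p, p, p)$ and evaluating (8.4) with the boosted 4-velocity give the same array, 
and setting $p = 0$ reduces the result to the dust form (8.2). 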



8.3 Conservation of energy and momentum for a perfect fluid 

Let us investigate how to express energy and momentum conservation in a local 
Cartesian inertial frame $S$ at some event $P$ that is represented by the local inertial 
coordinates $x^\mu$. In these coordinates, the energy-momentum tensor takes the 
form (8.4). 

By analogy with the equation $\partial_\mu j^\mu = 0$ for the conservation of charge, which 
we derived in Chapter 6, the conservation of energy and momentum is represented 
by the equation 

$$ \partial_\mu T^{\mu\nu} = 0. \tag{8.6} $$

Rather than arriving at this result from first principles, which would take us into 
a lengthy discussion of relativistic fluid mechanics, let us instead reverse the 
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process and justify our assertion by arguing that it produces the correct equations 
of motion and continuity for a fluid in the Newtonian limit. 

Substituting the form (8.4) into (8.6) gives 

d^P + P/c 2 ) u ^u v + (p + P/c 2 )[(d^)u v + u^(d^u v )]-{d IJL p)tf v = 0. (8.7) 

Now, the 4-velocity satisfies the normalisation condition u v u v = c 2 and differenta- 
tion of this gives 

(d^u v )u v + u v (d^u v ) = 2 (d tl u v )u v = 0. 

Thus, contracting (8.7) with u v , dividing through by c 2 and collecting terms gives 

^(pw^) + (p/c^d^ = 0. (8.8) 

Equation (8.7) therefore simplifies to 

(p + p/c 1 )(d IJL u v )u lx = (rf v - u tJ 'u v /c 2 )d fl p. (8.9) 

Equations (8.8) and (8.9) are, in fact, respectively the relativistic equation of 
continuity and the relativistic equation of motion for a perfect fluid in local inertial 
coordinates at some event P} We will now show that for slowly moving fluids 
and small pressures they reduce to the classical equations of Newtonian theory. 

By a slowly moving fluid, we mean one for which we may neglect $u/c$ and so
take $\gamma_u \approx 1$ and $[u^\mu] \approx (c, \vec{u})$; note that the difference between the proper density
and the density disappears in this limit. By small pressures we mean that $p/c^2$ is
negligible compared with $\rho$. In these limits, equation (8.8) then reduces to

$$\partial_\mu(\rho u^\mu) = 0,$$

or, in 3-vector notation,

$$\frac{\partial\rho}{\partial t} + \vec{\nabla}\cdot(\rho\vec{u}) = 0,$$

which is the classical equation of continuity for a fluid. In the limit of small
pressures, equation (8.9) reduces to

$$\rho(\partial_\mu u^\nu)u^\mu = (\eta^{\mu\nu} - u^\mu u^\nu/c^2)\,\partial_\mu p.$$

Moreover, in our slowly-moving approximation, the zeroth components of the
left- and right-hand sides are both zero. Thus the spatial components $i = 1, 2, 3$
satisfy

$$\rho(\partial_\mu u^i)u^\mu = -\delta^{ij}\partial_j p.$$

In 3-vector notation this reads

$$\rho\left(\frac{\partial}{\partial t} + \vec{u}\cdot\vec{\nabla}\right)\vec{u} = -\vec{\nabla}p,$$

which is Euler’s classical equation of motion for a perfect fluid. Hence we have
shown that the relativistic continuity equation (8.8) and the equation of motion
(8.9) for a perfect fluid reduce to the appropriate Newtonian equations. If we were
to accept that a relativistic fluid were described by (8.8) and (8.9) then we could
reverse our overall argument and derive the result $\partial_\mu T^{\mu\nu} = 0$.

$^1$ As usual, these equations may be generalised to a form valid in arbitrary coordinates by replacing $\partial_\mu$ by $\nabla_\mu$
and replacing $\eta^{\mu\nu}$ by $g^{\mu\nu}$.
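
To get a feel for how well the small-pressure condition $p/c^2 \ll \rho$ holds for ordinary matter, the ratio can be evaluated numerically. The sketch below is illustrative only: the function name `pressure_ratio` and the sample values (water at roughly atmospheric pressure) are assumptions, not taken from the text.

```python
# Rough check of the "small pressure" condition p/c^2 << rho for everyday matter.
# The sample numbers (water at about 1 atm) are illustrative assumptions.

C = 2.998e8  # speed of light in m/s


def pressure_ratio(p, rho):
    """Return the dimensionless ratio (p/c^2)/rho for pressure p [Pa] and density rho [kg/m^3]."""
    return p / (C**2 * rho)


water_ratio = pressure_ratio(p=1.013e5, rho=1.0e3)
print(f"(p/c^2)/rho for water at 1 atm: {water_ratio:.2e}")
```

The ratio comes out many orders of magnitude below unity, which is why the pressure term can be dropped next to $\rho$ in the Newtonian limit.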

So far we have worked in local inertial coordinates in order to make contact 
with the Newtonian theory. Nevertheless, we can trivially obtain the condition for 
energy and momentum conservation in arbitrary coordinates by replacing $\partial_\mu$ by
$\nabla_\mu$ in (8.6), which then gives

$$\nabla_\mu T^{\mu\nu} = 0. \tag{8.10}$$


This important equation is worthy of further comment. In our discussion so 
far, we have not been explicit about whether our spacetime is Minkowskian or 
curved. Although the form (8.10) is valid (in arbitrary coordinates) in both cases, 
its interpretation differs in the two cases. If we neglect gravity and assume a 
Minkowski spacetime, the relation (8.10) does indeed represent the conservation 
of energy and momentum. In the presence of a gravitational field (and hence 
a curved spacetime), however, the energy and momentum of the matter alone 
is not conserved. In this case, (8.10) represents the equation of motion of the 
matter under the influence of the gravitational field; this is discussed further in 
Section 8.8. As we will see below, the condition (8.10) places a tight restriction 
on the possible forms that the gravitational field equations may take. 


8.4 The Einstein equations 

We are now in a position to deduce the form of the gravitational field equations
proposed by Einstein. Let us begin by recalling some of our previous results.

• The field equation of Newtonian gravity is

$$\nabla^2\Phi = 4\pi G\rho.$$

• If gravity is a manifestation of spacetime curvature, we showed in Chapter 7,
equation (7.8), that for a weak gravitational field, in coordinates such that $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$
(with $|h_{\mu\nu}| \ll 1$) and in which the metric is static, then

$$g_{00} = 1 + \frac{2\Phi}{c^2}. \tag{8.11}$$

• The correct relativistic description of matter is provided by the energy-momentum
tensor and, for a perfect fluid or dust, in the IRF we have

$$T_{00} = \rho c^2.$$

Combining these observations suggests that, for a weak static gravitational field
in the low-velocity limit,

$$\nabla^2 g_{00} = \frac{8\pi G}{c^4}\,T_{00}.$$

Einstein’s fundamental intuition was that the curvature of spacetime at any 
event is related to the matter content at that event. The above considerations thus 
suggest that the gravitational field equations should be of the form 

$$K_{\mu\nu} = \kappa T_{\mu\nu}, \tag{8.12}$$

where $K_{\mu\nu}$ is a rank-2 tensor related to the curvature of spacetime and we have
set $\kappa = 8\pi G/c^4$. Since the curvature of spacetime is expressed by the curvature
tensor $R_{\mu\nu\sigma\rho}$, the tensor $K_{\mu\nu}$ must be constructed from $R_{\mu\nu\sigma\rho}$ and the metric tensor
$g_{\mu\nu}$. Moreover, $K_{\mu\nu}$ should have the following properties: (i) the Newtonian limit
suggests that $K_{\mu\nu}$ should contain terms no higher than linear in the second-order
derivatives of the metric tensor; and (ii) since $T_{\mu\nu}$ is symmetric then $K_{\mu\nu}$ should
also be symmetric. The curvature tensor $R_{\mu\nu\sigma\rho}$ is already linear in the second
derivatives of the metric, and so the most general form for $K_{\mu\nu}$ that satisfies (i)
and (ii) is

$$K_{\mu\nu} = aR_{\mu\nu} + bRg_{\mu\nu} + \lambda g_{\mu\nu}, \tag{8.13}$$

where $R_{\mu\nu}$ is the Ricci tensor, $R$ is the curvature scalar and $a$, $b$, $\lambda$ are constants.

Let us now consider the constants $a$, $b$, $\lambda$. First, if we require that every term
in $K_{\mu\nu}$ is linear in the second derivatives of $g_{\mu\nu}$ then we see immediately that
$\lambda = 0$. We will relax this condition later, but for the moment we therefore have

$$K_{\mu\nu} = aR_{\mu\nu} + bRg_{\mu\nu}.$$

To find the constants $a$ and $b$ we recall that the energy-momentum tensor satisfies
$\nabla_\mu T^{\mu\nu} = 0$; thus, from (8.10), we also require

$$\nabla_\mu K^{\mu\nu} = \nabla_\mu(aR^{\mu\nu} + bRg^{\mu\nu}) = 0.$$

However, in Section 7.11 we showed that

$$\nabla_\mu(R^{\mu\nu} - \tfrac12 g^{\mu\nu}R) = 0,$$

and so, remembering that $\nabla_\mu g^{\mu\nu} = 0$, we obtain

$$\nabla_\mu K^{\mu\nu} = (\tfrac12 a + b)\,g^{\mu\nu}\nabla_\mu R = 0.$$

The quantity $\nabla_\mu R$ will, in general, be non-zero throughout (a region of) spacetime
unless the latter is flat and hence there is no gravitational field. Thus we find that
$b = -a/2$, and so the gravitational field equations must take the form

$$a(R_{\mu\nu} - \tfrac12 g_{\mu\nu}R) = \kappa T_{\mu\nu}.$$

To fix the constant $a$, we must compare the weak-field limit of these equations
with Poisson’s equation in Newtonian gravity. The comparison is presented in the
next section, where we show that, for consistency with the Newtonian theory, we
require $a = -1$ and so

$$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = -\kappa T_{\mu\nu}, \tag{8.14}$$

where $\kappa = 8\pi G/c^4$. Equation (8.14) constitutes Einstein’s gravitational field
equations, which form the mathematical basis of the theory of general relativity.
We note that the left-hand side of (8.14) is simply the Einstein tensor $G_{\mu\nu}$, defined
in Chapter 7.

We can obtain an alternative form of Einstein’s equations by writing (8.14) in
terms of mixed components,

$$R^\mu{}_\nu - \tfrac12\delta^\mu_\nu R = -\kappa T^\mu{}_\nu,$$

and contracting by setting $\mu = \nu$. We thus find that $R = \kappa T$, where $T \equiv T^\mu{}_\mu$.
Hence we can write Einstein’s equations (8.14) as

$$R_{\mu\nu} = -\kappa(T_{\mu\nu} - \tfrac12 Tg_{\mu\nu}). \tag{8.15}$$
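
The step from (8.14) to (8.15) uses only the fact that $\delta^\mu_\mu = 4$ in four dimensions. A short sketch of the algebra, written out for convenience:

```latex
\begin{align*}
R^\mu{}_\mu - \tfrac12 \delta^\mu_\mu R &= -\kappa T^\mu{}_\mu
 \;\Rightarrow\; R - 2R = -\kappa T
 \;\Rightarrow\; R = \kappa T, \\
R_{\mu\nu} &= -\kappa T_{\mu\nu} + \tfrac12 g_{\mu\nu} R
 = -\kappa\left(T_{\mu\nu} - \tfrac12 T g_{\mu\nu}\right).
\end{align*}
```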


In four-dimensional spacetime $g_{\mu\nu}$ has 10 independent components and so in
general relativity we have 10 independent field equations. We may compare this
with Newtonian gravity, in which there is only one gravitational field equation.
Furthermore, the Einstein field equations are non-linear in the $g_{\mu\nu}$ whereas Newtonian
gravity is linear in the field $\Phi$. Einstein’s theory thus involves numerous
non-linear differential equations, and so it should come as no surprise that the 
theory is complicated. 


8.5 The Einstein equations in empty space 

In general, $T_{\mu\nu}$ contains all forms of energy and momentum. Of course, this
includes any matter present but if there is also electromagnetic radiation, for
example, then it too must be included in $T_{\mu\nu}$ (the resulting expression is somewhat
complicated; see Exercise 8.3).

A region of spacetime in which $T_{\mu\nu} = 0$ is called empty, and such a region is
therefore not only devoid of matter but also of radiative energy and momentum. It
can be seen from (8.15) that the gravitational field equations for empty space are

$$R_{\mu\nu} = 0. \tag{8.16}$$


From this simple equation, we can immediately establish a profound result. 
Consider the number of field equations as a function of the number of space- 
time dimensions; then, for two, three and four dimensions, the numbers of field 
equations and independent components of $R_{\mu\nu\sigma\rho}$ are as shown in the table.

No. of spacetime dimensions                                  2     3     4
No. of field equations                                       3     6    10
No. of independent components of $R_{\mu\nu\sigma\rho}$      1     6    20


Thus we see that in two or three dimensions the field equations in empty space 
guarantee that the full curvature tensor must vanish. In four dimensions, however, 
we have 10 field equations but 20 independent components of the curvature 
tensor. It is therefore possible to satisfy the field equations in empty space with 
a non-vanishing curvature tensor. Remembering that a non-vanishing curvature 
tensor represents a non-vanishing gravitational field, we conclude that it is only 
in four dimensions or more that gravitational fields can exist in empty space. 
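
The entries in the table follow from two standard counting formulas, which are assumed here rather than derived in this section: a symmetric rank-2 tensor in $n$ dimensions has $n(n+1)/2$ independent components, and the curvature tensor has $n^2(n^2-1)/12$. A minimal sketch that reproduces the table (the function names are illustrative):

```python
# Counting argument behind the table, for an n-dimensional spacetime.
# Assumed standard results: a symmetric rank-2 tensor has n(n+1)/2 independent
# components, and the curvature tensor has n^2 (n^2 - 1) / 12.


def num_field_equations(n):
    """Independent components of the symmetric tensor R_{mu nu} in n dimensions."""
    return n * (n + 1) // 2


def num_riemann_components(n):
    """Independent components of the curvature tensor in n dimensions."""
    return n**2 * (n**2 - 1) // 12


for n in (2, 3, 4):
    print(n, num_field_equations(n), num_riemann_components(n))
```

Only for $n = 4$ (and above) does the number of curvature components exceed the number of field equations, which is the point made in the text.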


8.6 The weak-field limit of the Einstein equations 

To determine the ‘weak-field’ limit of the Einstein equations our preliminary 
discussion in Section 8.4 suggests that we need only consider their 00-component. 
It is most convenient to use the form (8.15) of the equations, from which we have

$$R_{00} = -\kappa(T_{00} - \tfrac12 Tg_{00}). \tag{8.17}$$

In the weak-field approximation, spacetime is only ‘slightly’ curved and so
there exist coordinates in which $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, with $|h_{\mu\nu}| \ll 1$, and the metric
is stationary. Hence in this case $g_{00} \approx 1$. Moreover, from the definition of the
curvature tensor we find that $R_{00}$ is given by

$$R_{00} = \partial_0\Gamma^\mu{}_{0\mu} - \partial_\mu\Gamma^\mu{}_{00} + \Gamma^\nu{}_{0\mu}\Gamma^\mu{}_{\nu 0} - \Gamma^\nu{}_{00}\Gamma^\mu{}_{\nu\mu}.$$

In our coordinate system the $\Gamma^\mu{}_{\nu\sigma}$ are small, so we can neglect the last two
terms to first order in $h_{\mu\nu}$. Also using the fact that the metric is stationary in our
coordinate system, we then have

$$R_{00} \approx -\partial_i\Gamma^i{}_{00}.$$

In our discussion of the Newtonian limit in Chapter 7, however, we found that
$\Gamma^i{}_{00} \approx \tfrac12\delta^{ij}\partial_j h_{00}$ to first order in $h_{\mu\nu}$, and so

$$R_{00} \approx -\tfrac12\delta^{ij}\partial_i\partial_j h_{00}.$$

Substituting our approximate expressions for $g_{00}$ and $R_{00}$ into (8.17), in the
‘weak-field’ limit we thus have

$$\tfrac12\delta^{ij}\partial_i\partial_j h_{00} \approx \kappa(T_{00} - \tfrac12 T). \tag{8.18}$$

To proceed further we must assume a form for the matter producing the weak
gravitational field and for simplicity we consider a perfect fluid. Most classical
matter distributions have $p/c^2 \ll \rho$ and so we may in fact take the energy-momentum
tensor to be that of dust, i.e.

$$T_{\mu\nu} = \rho u_\mu u_\nu,$$

which gives $T = \rho c^2$. In addition, let us assume that the particles making up the
fluid have speeds $u$ in our coordinate system that are small compared with $c$. We
thus make the approximation $\gamma_u \approx 1$ and hence $u_0 \approx c$. Therefore equation (8.18)
reduces to

$$\tfrac12\delta^{ij}\partial_i\partial_j h_{00} \approx \tfrac12\kappa\rho c^2.$$

We may, however, write $\delta^{ij}\partial_i\partial_j = \nabla^2$; furthermore, from (8.11) we have $h_{00} =
2\Phi/c^2$, where $\Phi$ is the gravitational potential. Thus, remembering that $\kappa =
8\pi G/c^4$, we finally obtain

$$\nabla^2\Phi \approx 4\pi G\rho,$$

which is Poisson’s equation in Newtonian gravity. This identification verifies our
earlier assertion that $a = -1$ in the derivation of Einstein’s equations (8.14).
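
It may help to see how small the perturbation $h_{00} = 2\Phi/c^2$ actually is in a familiar setting. The sketch below evaluates it at the surface of the Sun, taking $\Phi = -GM/r$ for a point mass; the solar mass and radius are standard reference values assumed for illustration and do not come from the text, and the function name `h00` is hypothetical.

```python
# Size of the metric perturbation h_00 = 2*Phi/c^2 at the surface of the Sun,
# taking Phi = -G*M/r for a point mass. The solar values are assumed
# reference numbers used only for illustration.

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
R_SUN = 6.957e8    # solar radius, m


def h00(mass, radius):
    """Weak-field perturbation h_00 = 2*Phi/c^2 with Phi = -G*mass/radius."""
    phi = -G * mass / radius
    return 2.0 * phi / C**2


h_sun = h00(M_SUN, R_SUN)
print(f"h_00 at the solar surface: {h_sun:.2e}")
```

The result is of order $10^{-6}$ in magnitude, so the condition $|h_{\mu\nu}| \ll 1$ is comfortably satisfied throughout the Solar System.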


8.7 The cosmological-constant term 

The standard Einstein gravitational field equations are

$$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = -\kappa T_{\mu\nu}. \tag{8.19}$$

However, these equations are not unique. In fact, shortly after Einstein derived 
them he proposed a modification known as the cosmological term. 

In deriving the field equations (8.14), we assumed that the tensor $K_{\mu\nu}$ that
makes up the left-hand side of the field equations,

$$K_{\mu\nu} = \kappa T_{\mu\nu},$$

should contain only terms that are linear in the second-order derivatives of $g_{\mu\nu}$.
This led us to set $\lambda = 0$ in (8.13), i.e. to discard the term $\lambda g_{\mu\nu}$ in the tensor $K_{\mu\nu}$.
Let us now relax this assumption.

Recalling that $\nabla_\mu T^{\mu\nu} = 0$ we still require $\nabla_\mu K^{\mu\nu} = 0$, but in Section 4.12 we
showed that

$$\nabla_\mu g^{\mu\nu} = 0.$$

Thus, we can add any constant multiple of $g_{\mu\nu}$ to the left-hand side of (8.19) and
still obtain a consistent set of field equations. It is usual to denote this multiple
by $\Lambda$, so that the field equations become

$$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R + \Lambda g_{\mu\nu} = -\kappa T_{\mu\nu}, \tag{8.20}$$

where $\Lambda$ is some new universal constant of nature known as the cosmological
constant. By writing this equation in terms of the mixed components and contracting,
as we did with the standard field equations, we find that $R = \kappa T + 4\Lambda$.
Substituting this expression into (8.20), we obtain an alternative form of the field
equations,

$$R_{\mu\nu} = -\kappa(T_{\mu\nu} - \tfrac12 Tg_{\mu\nu}) + \Lambda g_{\mu\nu}. \tag{8.21}$$
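
The contraction quoted above can be checked in two lines. The following sketch assumes the field equations with the cosmological term in the form $R_{\mu\nu} - \tfrac12 g_{\mu\nu}R + \Lambda g_{\mu\nu} = -\kappa T_{\mu\nu}$ and uses $\delta^\mu_\mu = 4$:

```latex
\begin{align*}
R^\mu{}_\mu - \tfrac12\delta^\mu_\mu R + \Lambda\delta^\mu_\mu &= -\kappa T^\mu{}_\mu
 \;\Rightarrow\; -R + 4\Lambda = -\kappa T
 \;\Rightarrow\; R = \kappa T + 4\Lambda, \\
R_{\mu\nu} &= -\kappa T_{\mu\nu} + \tfrac12 g_{\mu\nu}(\kappa T + 4\Lambda) - \Lambda g_{\mu\nu}
 = -\kappa\left(T_{\mu\nu} - \tfrac12 T g_{\mu\nu}\right) + \Lambda g_{\mu\nu}.
\end{align*}
```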

Following the procedure presented in Section 8.6, it is straightforward to show
that, in the weak-field limit, the field equation of ‘Newtonian’ gravity becomes

$$\nabla^2\Phi = 4\pi G\rho - \Lambda c^2.$$

For a spherical mass $M$, the gravitational field strength is easily found to be

$$\vec{g} = -\vec{\nabla}\Phi = -\frac{GM}{r^2}\,\hat{\vec{r}} + \frac{c^2\Lambda r}{3}\,\hat{\vec{r}}.$$


Thus, in this case, we see that the cosmological constant term corresponds to a 
gravitational repulsion whose strength increases linearly with r. 
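
Because the repulsive term grows with $r$ while the attractive term falls as $1/r^2$, there is a radius at which the two cancel, given by $r^3 = 3GM/(\Lambda c^2)$. The sketch below evaluates it for one solar mass. The value of $\Lambda$ used is an assumed order of magnitude (about $10^{-52}\,\mathrm{m}^{-2}$) that is not given in the text, and the function names are illustrative.

```python
# Radius at which the cosmological repulsion c^2*Lambda*r/3 balances the
# Newtonian attraction G*M/r^2 of a point mass. LAMBDA is an assumed
# order-of-magnitude value (about 1e-52 m^-2), not a number from the text.

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s
LAMBDA = 1.1e-52    # assumed cosmological constant, m^-2
M_SUN = 1.989e30    # solar mass, kg


def radial_field(mass, r, lam=LAMBDA):
    """Radial component of g: -G*mass/r^2 + c^2*lam*r/3 (negative means attractive)."""
    return -G * mass / r**2 + C**2 * lam * r / 3.0


def balance_radius(mass, lam=LAMBDA):
    """Radius where the two terms cancel: r^3 = 3*G*mass / (lam*c^2)."""
    return (3.0 * G * mass / (lam * C**2)) ** (1.0 / 3.0)


r0 = balance_radius(M_SUN)
print(f"balance radius for one solar mass: {r0:.2e} m")
```

With these assumed numbers the balance radius is vastly larger than the Solar System, so the extra term is negligible for planetary dynamics and only matters on much larger scales.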

The reason for calling $\Lambda$ the cosmological constant is historical. Einstein first
introduced this term because he was unable to construct static models of the 
universe from his standard field equations (8.19). What he found (and we will 
discuss this in detail in Chapter 15) was that the standard field equations predicted 
a universe that was either expanding or contracting. Einstein did this work in about 
1916, when people thought that our Milky Way Galaxy represented the whole 
universe, which Einstein represented as a uniform distribution of ‘fixed stars’. 
By introducing $\Lambda$, Einstein constructed static models of the universe (which as
we will see are actually unstable). It was later realised, however, that the Milky
Way is just one of a great many galaxies. Moreover, in 1929 Edwin Hubble
discovered the expansion of the universe by measuring distances and redshifts to
nearby external galaxies. The universe was proved to be expanding and the need 
for a cosmological constant disappeared. Einstein is reputed to have said that the 
introduction of the cosmological constant was his ‘biggest blunder’. 

Nowadays we have a rather different view of the cosmological constant. Recall
that the energy-momentum tensor of a perfect fluid is

$$T^{\mu\nu} = (\rho + p/c^2)u^\mu u^\nu - pg^{\mu\nu}.$$

Imagine some type of ‘substance’ with a strange equation of state $p = -\rho c^2$. This
is unlike any kind of substance that you have ever encountered because it has a
negative pressure! The energy-momentum tensor for this substance would be

$$T^{\mu\nu} = -pg^{\mu\nu} = \rho c^2 g^{\mu\nu}.$$

There are two points to note about this equation. First, the energy-momentum
tensor of this strange substance depends only on the metric tensor; it is therefore
a property of the vacuum itself and we can call $\rho$ the energy density of the
vacuum. Second, the form of $T^{\mu\nu}$ is the same as the cosmological-constant term
in (8.20). We can therefore view the cosmological constant as a universal constant
that fixes the energy density of the vacuum,

$$\rho_{\rm vac}c^2 = \frac{\Lambda c^4}{8\pi G}. \tag{8.22}$$

Denoting the energy-momentum tensor of the vacuum by $T^{\rm vac}_{\mu\nu} = \rho_{\rm vac}c^2 g_{\mu\nu}$, we
can thus write the modified gravitational field equations (8.20) as

$$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = -\kappa(T_{\mu\nu} + T^{\rm vac}_{\mu\nu}),$$

where $T_{\mu\nu}$ is the energy-momentum tensor of any matter or radiation present.
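
Taking the relation (8.22) in the form $\rho_{\rm vac}c^2 = \Lambda c^4/(8\pi G)$, the vacuum mass density follows directly from a value of $\Lambda$. The sketch below is illustrative: the value of $\Lambda$ is an assumed order of magnitude (about $10^{-52}\,\mathrm{m}^{-2}$) that does not appear in the text, and the function name `vacuum_density` is hypothetical.

```python
import math

# Vacuum mass density implied by rho_vac * c^2 = Lambda * c^4 / (8*pi*G),
# i.e. rho_vac = Lambda * c^2 / (8*pi*G). The value of Lambda passed in below
# is an assumption for illustration, not a number from the text.

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s


def vacuum_density(lam):
    """Mass density rho_vac [kg/m^3] for a cosmological constant lam [m^-2]."""
    return lam * C**2 / (8.0 * math.pi * G)


rho_vac = vacuum_density(1.1e-52)
print(f"rho_vac = {rho_vac:.2e} kg/m^3")
```

For this assumed $\Lambda$ the density is of order $10^{-27}\,\mathrm{kg\,m^{-3}}$, equivalent to only a few hydrogen atoms per cubic metre.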

How can we calculate the energy density of the vacuum? This is one of the 
major unsolved problems in physics. The simplest calculation involves summing 
the quantum mechanical zero-point energies of all the fields known in Nature. 
This gives an answer about 120 orders of magnitude higher than the upper limits 
on $\Lambda$ set by cosmological observations. This is probably the worst theoretical
prediction in the history of physics! Nobody knows how to make sense of this 
result. Some physical mechanism must exist that makes the cosmological constant 
very small. 

Some physicists have thought that a mechanism must exist that makes $\Lambda$ exactly
equal to zero. But in the last few years there has been increasing evidence that
the cosmological constant is small but non-zero. The strongest evidence comes
from observations of distant Type Ia supernovae that indicate that the expansion of
the universe is actually accelerating rather than decelerating. Normally, one would 
have thought that the gravity of matter in the universe would cause the expansion 
to slow down (perhaps even eventually halting the expansion and causing the 
universe to collapse). But if the cosmological constant is non-zero, the negative 
pressure of the vacuum can cause the universe to accelerate. 

Whether these supernova observations are right or not is an area of active
research, and the theoretical problem of explaining the value of the cosmological 
constant is one of the great challenges of theoretical physics. It is most likely 
that we require a fully developed theory of quantum gravity (perhaps superstring 
theory) before we can understand $\Lambda$.


8.8 Geodesic motion from the Einstein equations 

The Einstein equations give a quantitative description of how the energy- 
momentum distribution of matter (or other fields) at any event determines the 
spacetime curvature at that event. We also know that, under the influence of 
gravity alone, matter moves along geodesics in the curved spacetime. We now 
show that it is, in fact, unnecessary to make the separate postulate of geodesic 
motion, since it follows directly from the Einstein equations themselves. 

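The field equations were derived partly from the requirement that the covariant
divergence of the energy-momentum tensor vanishes,

$$\nabla_\mu T^{\mu\nu} = 0. \tag{8.23}$$

As noted in Section 8.3, this relation represents the equation of motion for matter
in the curved spacetime, and in this section we explore this interpretation in more
detail. For later convenience, we may also write (8.23) as

$$\begin{aligned}
\nabla_\mu T^{\mu\nu} &= \partial_\mu T^{\mu\nu} + \Gamma^\mu{}_{\sigma\mu}T^{\sigma\nu} + \Gamma^\nu{}_{\sigma\mu}T^{\sigma\mu} \\
 &= \frac{1}{\sqrt{-g}}\,\partial_\mu(\sqrt{-g}\,T^{\mu\nu}) + \Gamma^\nu{}_{\sigma\mu}T^{\sigma\mu} = 0,
\end{aligned} \tag{8.24}$$

where in the last line we have used the expression (3.26) for the contracted
connection coefficient $\Gamma^\mu{}_{\sigma\mu}$, and we note that $|g| = -g$ for a spacetime metric
with signature $-2$.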

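Let us first consider directly the specific case of a single test particle of rest
mass $m$. By analogy with (8.2), the energy-momentum tensor of the particle as a
function of position $x$ may be written as

$$T^{\mu\nu}(x) = \frac{m}{\sqrt{-g}}\int \frac{dz^\mu}{d\tau}\frac{dz^\nu}{d\tau}\,\delta^4(x - z(\tau))\,d\tau, \tag{8.25}$$

where $z^\mu(\tau)$ is the worldline of the particle and $\tau$ is its proper time.$^2$ Inserting
(8.25) into (8.23) and using the result (8.24), we obtain

$$\int \dot z^\mu\dot z^\nu\,\frac{\partial}{\partial x^\mu}\delta^4(x - z(\tau))\,d\tau + \Gamma^\nu{}_{\mu\sigma}\int \dot z^\mu\dot z^\sigma\,\delta^4(x - z(\tau))\,d\tau = 0, \tag{8.26}$$

where the dots denote differentiation with respect to $\tau$. Since $\delta^4(x - z(\tau))$ depends
only on the difference $x^\mu - z^\mu$, we can replace $\partial/\partial x^\mu$ by $-\partial/\partial z^\mu$ where it acts
upon the delta function. Then, by noting that

$$\dot z^\mu\,\frac{\partial}{\partial z^\mu}\delta^4(x - z(\tau)) = \frac{d}{d\tau}\delta^4(x - z(\tau)),$$

we may write (8.26) as

$$-\int \dot z^\nu\,\frac{d}{d\tau}\delta^4(x - z(\tau))\,d\tau + \Gamma^\nu{}_{\mu\sigma}\int \dot z^\mu\dot z^\sigma\,\delta^4(x - z(\tau))\,d\tau = 0.$$

On performing the first integral on the left-hand side by parts and collecting
together terms, this becomes

$$\int(\ddot z^\nu + \Gamma^\nu{}_{\mu\sigma}\dot z^\mu\dot z^\sigma)\,\delta^4(x - z(\tau))\,d\tau = 0.$$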

For this integral to vanish, we clearly require the first factor in the integrand to 
equal zero, from which we recover directly the standard geodesic equation of 
motion. 
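
The integration by parts used in this argument can be written out explicitly. In the sketch below the limits $\tau_1$, $\tau_2$ are introduced only for illustration, and the boundary term is dropped on the assumption that the event $x$ does not coincide with either endpoint of the worldline segment considered.

```latex
\begin{align*}
-\int_{\tau_1}^{\tau_2} \dot z^\nu \frac{d}{d\tau}\delta^4(x - z(\tau))\,d\tau
 &= -\Big[\dot z^\nu\,\delta^4(x - z(\tau))\Big]_{\tau_1}^{\tau_2}
    + \int_{\tau_1}^{\tau_2} \ddot z^\nu\,\delta^4(x - z(\tau))\,d\tau \\
 &= \int_{\tau_1}^{\tau_2} \ddot z^\nu\,\delta^4(x - z(\tau))\,d\tau .
\end{align*}
```

Adding the connection term then gives the integrand $\ddot z^\nu + \Gamma^\nu{}_{\mu\sigma}\dot z^\mu \dot z^\sigma$, which must vanish along the worldline.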

The derivation above offers an entirely new insight into the equation of motion. 
The position of the particle is where the field equations become singular, but 
the solution of the field equations in the empty space surrounding the singularity 
determines how it should move, i.e. it obeys the same equation of motion as that of 
a ‘test particle’. The fact that the Einstein equations predict the equation of motion 
is remarkable and should be contrasted with the situation in electrodynamics. In the 
latter case, the Maxwell equations for the electromagnetic field do not contain the 
corresponding equation of motion for a charged particle, which has to be postulated 
separately. The origin of this distinction between gravity and electromagnetism 
lies in the non-linear nature of the Einstein equations. The physical reason for 
this non-linearity is that the gravitational field itself carries energy-momentum 
and can therefore act as its own source, whereas the electromagnetic field carries no
charge and so cannot act as its own source. 


$^2$ The four-dimensional delta function $\delta^4(x - y)$ is defined by the relation

$$\int \Phi(x)\,\delta^4(x - y)\,d^4x = \Phi(y),$$

where $\Phi$ is any scalar field. Since $\sqrt{-g}\,d^4x$ is the invariant volume element, it follows that $\delta^4(x - y)/\sqrt{-g}$
is the invariant scalar that must be used in (8.25).
It is worthwhile generalising the above discussion from a single point particle
to a continuous matter distribution. As a simple example, we shall consider
a distribution of dust (i.e. a pressureless perfect fluid), for which the energy-momentum
tensor is given by

$$T^{\mu\nu} = \rho u^\mu u^\nu.$$

In this case, the equation of motion (8.23) thus reads

$$\nabla_\mu(\rho u^\mu u^\nu) = \nabla_\mu(\rho u^\mu)u^\nu + \rho u^\mu\nabla_\mu u^\nu = 0. \tag{8.27}$$

Contracting this expression with $u_\nu$, we have

$$c^2\nabla_\mu(\rho u^\mu) + \rho u^\mu u_\nu\nabla_\mu u^\nu = 0, \tag{8.28}$$

where we have used the fact that $u^\nu u_\nu = c^2$. Using this result again, one finds
that $u_\nu\nabla_\mu u^\nu = 0$, and so the second term in (8.28) vanishes. Thus, we obtain

$$\nabla_\mu(\rho u^\mu) = 0,$$

which is simply the general-relativistic conservation equation. Substituting this
expression back into (8.27) gives

$$u^\mu\nabla_\mu u^\nu = 0, \tag{8.29}$$

which is the equation of motion for the dust distribution in a gravitational field.
Moreover, let us consider the worldline $x^\mu(\tau)$ of a dust particle. From (3.38) the
intrinsic derivative of the particle’s 4-velocity $u^\nu$ along the worldline is given by

$$\frac{Du^\nu}{D\tau} \equiv u^\mu\nabla_\mu u^\nu = 0,$$

where we have used (8.29) to obtain the last equality. Since the intrinsic derivative
of the 4-velocity (i.e. the tangent vector to the worldline) is zero, the dust particle’s
worldline $x^\mu(\tau)$ is a geodesic. We can show this explicitly using the expression
(3.37) for the intrinsic derivative, from which we immediately obtain the geodesic
equation

$$\ddot x^\nu + \Gamma^\nu{}_{\mu\sigma}\dot x^\mu\dot x^\sigma = 0.$$


8.9 Concluding remarks 

We have now completed the task commenced in Chapter 1 of formulating a 
consistent relativistic theory of gravity. This has led us to the interpretation of 
gravity as a manifestation of spacetime curvature induced by the presence of 
matter (and other fields). This principle is embodied mathematically in the Einstein 
field equations (8.20). In the remainder of this book, apart from the final chapter,
we will explore the physical consequences of these equations in a wide variety of 
astrophysical and cosmological applications. In the final chapter we will return to 
the formulation of general relativity itself, rederiving the Einstein equations from 
a variational principle. 


Appendix 8A: Alternative relativistic theories of gravity 

In Section 8.7, we described a relatively simple (but theoretically profound) 
modification of the Einstein field equations. This shows that Einstein’s field 
equations are not unique. It is also worth noting that it is possible to create more
radically different theories of gravity, as follows. 


Scalar theory of gravity 

The simplest relativistic generalisation of Newtonian gravity is obtained by continuing
to represent the gravitational field by the scalar $\Phi$. Since matter is described
relativistically by the energy-momentum tensor $T_{\mu\nu}$, the only scalar with the
dimensions of a mass density is $T^\mu{}_\mu$. Thus a consistent scalar relativistic theory of
gravity is given by the field equation

$$\Box^2\Phi = -\frac{4\pi G}{c^2}\,T^\mu{}_\mu. \tag{8.30}$$


However, this theory must be rejected since, when used with the appropriate 
equation of motion, it predicts a retardation of the perihelion of Mercury, in 
contradiction of observations. Moreover, it does not allow one to couple gravity 
to electromagnetism, since $(T_{\rm EM})^\mu{}_\mu = 0$; in such a theory we could have neither
gravitational redshift nor the deflection of light by matter. 


Brans-Dicke theory 

A gravitational theory based on a vector field can be eliminated since such a theory 
predicts that two massive particles would repel one another, rather than attract. It is, 
of course, possible to construct relativistic theories of gravity in which combinations 
of the three kinds of field (scalar, vector and tensor) are used. The most important
of these alternative theories is Brans-Dicke theory, which we now discuss briefly. 

In deriving the Einstein field equations, we started with the principle of equivalence,
which led us to consider gravitation as spacetime curvature, and we found
a rank-2 tensor theory of gravity that agreed with Newton’s theory in the limit 
of weak gravitational fields and small velocities. Brans and Dicke also took the 
principle of equivalence as a starting point, and thus again described gravity 
in terms of spacetime curvature. However, they set about finding a consistent 
scalar-tensor theory of gravity. Instead of treating the gravitational constant $G$ as
a constant of nature, Brans and Dicke introduced a scalar field $\phi$ that determines
the strength of $G$, i.e. the scalar field $\phi$ determines the coupling strength of matter
to gravity. The key ideas of the theory are thus:

• matter, represented by the energy-momentum tensor $T^M_{\mu\nu}$, and a coupling constant
$\lambda$ fix the scalar field $\phi$;

• the scalar field $\phi$ fixes the value of $G$;

• the gravitational field equations relate the curvature to the energy-momentum tensors
of the scalar field and matter.

The coupled equations for the scalar field and the gravitational field in this
theory are therefore



where $T^M_{\mu\nu}$ is the energy-momentum tensor of the matter and $T^\phi_{\mu\nu}$ is the energy-momentum
tensor of the scalar field (the form of $T^\phi_{\mu\nu}$ is rather complicated). It
is usual (for historical reasons) to write the coupling constant as $\lambda = 2/(3 + 2\omega)$.
In the limit $\omega \to \infty$ we have $\lambda \to 0$, so $\phi$ is not affected by the matter distribution
and can be set equal to a constant $\phi = 1/G$. In this case, $T^\phi_{\mu\nu}$ vanishes, and hence
Brans-Dicke theory reduces to Einstein’s theory in the limit $\omega \to \infty$.

The Brans-Dicke theory is interesting because it shows that it is possible to
construct alternative theories that are consistent with the principle of equivalence.
Einstein’s theory is beautiful and simple, but it is not unique. One must therefore
look to experiment to find out which theory is correct. One of the features of
the Brans-Dicke model is that the effective gravitational ‘constant’ $G$ varies with
time because it is determined by the scalar field $\phi$. A variation in $G$ would affect
the orbits of the planets, altering, for example, the dates of solar eclipses (which
can be checked against historical records). A reasonably conservative conclusion
from experiments is that $\omega > 500$, so Einstein’s theory does seem to be the correct
theory of gravity, at least at low energies.
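
To see how quickly the coupling switches off as $\omega$ grows, the relation $\lambda = 2/(3 + 2\omega)$ can be tabulated. This is a minimal sketch: the function name `coupling` is illustrative, and the sample values of $\omega$ are arbitrary apart from $\omega = 500$, the bound quoted in the text.

```python
# Brans-Dicke coupling lambda = 2 / (3 + 2*omega) as a function of omega.
# The sample values of omega are arbitrary illustrations, apart from
# omega = 500, which is the experimental bound quoted in the text.


def coupling(omega):
    """Return the coupling constant lambda = 2 / (3 + 2*omega)."""
    return 2.0 / (3.0 + 2.0 * omega)


for omega in (1.0, 10.0, 500.0, 1.0e6):
    print(f"omega = {omega:g}: lambda = {coupling(omega):.3e}")
```

At the quoted bound the coupling is already about $2 \times 10^{-3}$, and it tends to zero as $\omega \to \infty$, consistent with the Einstein limit described above.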

Torsion theories 

Throughout our discussion of curved spacetimes we have assumed that the manifold
is torsionless. This is not a requirement, and we can generalise our discussion
to spacetimes with a non-zero torsion tensor,

$$T^\sigma{}_{\mu\nu} = \Gamma^\sigma{}_{\mu\nu} - \Gamma^\sigma{}_{\nu\mu}.$$

Typically, torsion is generated by the (quantum-mechanical) spin of particles.
Such theories are rather complicated mathematically, since we must make the
distinction between affine and metric connections and geodesics. Gravitational
theories that include spacetime torsion are often described as Einstein-Cartan
theories and have been extensively investigated. We will not discuss these theories
any further, however.


Appendix 8B: Sign conventions 

There is no accepted system of sign conventions in general relativity. Different 
books use different sign conventions for the metric tensor, for the curvature tensors 
and for the field equations. We can summarize these sign conventions in terms 
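of three sign factors S1, S2 and S3. These are defined as follows:

$$\eta^{\mu\nu} = [S1]\times(-1, +1, +1, +1),$$

$$R^\mu{}_{\alpha\beta\gamma} = [S2]\times\left(\partial_\beta\Gamma^\mu{}_{\alpha\gamma} - \partial_\gamma\Gamma^\mu{}_{\alpha\beta} + \Gamma^\mu{}_{\sigma\beta}\Gamma^\sigma{}_{\gamma\alpha} - \Gamma^\mu{}_{\sigma\gamma}\Gamma^\sigma{}_{\beta\alpha}\right),$$

$$G_{\mu\nu} = [S3]\times\frac{8\pi G}{c^4}\,T_{\mu\nu},$$

$$R_{\mu\nu} = [S2]\times[S3]\times R^\alpha{}_{\mu\alpha\nu}.$$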

In this text we have used a convention that matches that of both R. d’Inverno,
An Introduction to Einstein’s Relativity, Oxford University Press, 1992, and
W. Rindler, Relativity: Special, General and Cosmological, Oxford University
Press, 2001, but this differs from the convention used by, for example, Misner,
Thorne and Wheeler, Gravitation, Freeman (1973) or Weinberg, Gravitation and
Cosmology, Wiley (1972). Here is a summary of the sign conventions used in
the various books:

         Present text    d’Inverno, Rindler    MTW    Weinberg
[S1]          -                  -              +         +
[S2]          +                  +              +         -
[S3]          -                  -              +         -


Exercises 

8.1 Show that the components of the energy-momentum tensor of a perfect fluid in its
instantaneous rest frame can be written as in (8.3):

$$T^{\mu\nu} = (\rho + p/c^2)u^\mu u^\nu - p\eta^{\mu\nu}.$$

Can the components be written in any other covariant form?

8.2 Show that, for any fluid,

$$u_\nu\nabla_\mu u^\nu = 0.$$

Hence show that a perfect fluid in a gravitational field must satisfy the equations

$$\nabla_\mu(\rho u^\mu) + (p/c^2)\nabla_\mu u^\mu = 0,$$

$$(\rho + p/c^2)u^\mu\nabla_\mu u^\nu = (g^{\mu\nu} - u^\mu u^\nu/c^2)\nabla_\mu p.$$

Obtain the equation of motion for the worldline $x^\mu(\tau)$ of a particle in a perfect fluid
with pressure, and hence show that the particle is ‘pushed off’ geodesics by the
pressure gradient.

8.3 The electromagnetic field in vacuo has an energy-momentum tensor $T^{\mu\nu}_{\rm EM}$. By analogy
with the energy-momentum tensor for dust, we require that (i) $T^{\mu\nu}_{\rm EM}$ is symmetric;
(ii) $\nabla_\mu T^{\mu\nu}_{\rm EM} = 0$; (iii) $T^{\mu\nu}_{\rm EM}$ must be quadratic in the dynamical variable $F^{\mu\nu}$. Hence
show that

$$T^{\mu\nu}_{\rm EM} = a\left(F^\mu{}_\sigma F^{\nu\sigma} - \tfrac14 g^{\mu\nu}F_{\sigma\rho}F^{\sigma\rho}\right),$$

where $a$ is a constant. By examining the component $T^{00}_{\rm EM}$ in local Cartesian inertial
coordinates, show that the constant $a = -1/\mu_0$.

8.4 Consider a cloud of charged dust particles. Show that the equation of motion of such
a fluid is

$$\rho u^\mu\nabla_\mu u^\nu = \sigma F^\nu{}_\mu u^\mu,$$

where $\rho$ is the proper matter density of the fluid, $\sigma$ is its proper charge density and
$u^\mu$ is the fluid 4-velocity. Define an energy-momentum tensor $T^{\mu\nu}_{\rm d} = \rho u^\mu u^\nu$, where
$\rho$ is the proper density of the fluid. Hence show that

$$\nabla_\mu T^{\mu\nu}_{\rm d} = F^\nu{}_\mu j^\mu,$$

where $j^\mu = \sigma u^\mu$ is the 4-current density. Thus write down the energy-momentum
tensor for charged dust, $T^{\mu\nu} = T^{\mu\nu}_{\rm d} + T^{\mu\nu}_{\rm EM}$.

8.5 The energy-momentum tensor of an electromagnetic field interacting with a source
satisfies $\nabla_\mu T^{\mu\nu}_{\rm EM} = -F^{\nu\sigma}J_\sigma$, where $J^\sigma$ is the 4-current density of the source. Hence
show that the worldline of a particle of charge $q$ in an electromagnetic field satisfies

$$\ddot z^\nu + \Gamma^\nu{}_{\mu\sigma}\dot z^\mu\dot z^\sigma = \frac{q}{m}F^\nu{}_\sigma\dot z^\sigma,$$

and interpret this result physically.

8.6 The weak energy condition (WEC) states that any energy-momentum tensor must
satisfy

$$T_{\mu\nu}t^\mu t^\nu \geq 0$$

for all timelike vectors $t^\mu$. Show that for a perfect fluid the WEC implies that
$\rho \geq 0$ and $\rho c^2 + p \geq 0$.

8.7 The strong energy condition (SEC) states that any energy-momentum tensor must
satisfy

$$T_{\mu\nu}t^\mu t^\nu \geq \tfrac12 T^\rho{}_\rho\,t^\sigma t_\sigma$$

for all timelike vectors $t^\mu$. Show that for a perfect fluid the SEC implies that
$\rho c^2 + p \geq 0$ and $\rho c^2 + 3p \geq 0$.

Does the SEC imply the WEC in Exercise 8.6? Show further that, from the Einstein
equations, the SEC implies that $R_{\mu\nu}t^\mu t^\nu \leq 0$, where $R_{\mu\nu}$ is the Ricci tensor.

8.8 The equation-of-state parameter $w$ is defined by $w = p/(\rho c^2)$. If one restricts
oneself to sources for which $\rho > 0$, show that both the weak and strong energy
conditions in Exercises 8.6 and 8.7 imply that $w \geq -1$.

8.9 Write down the form of the energy-momentum tensor for a perfect fluid with
4-velocity $u^\mu$ with respect to some Cartesian inertial frame S. Show that for the
energy-momentum tensor to be invariant under a Lorentz transformation to any
other inertial frame one requires $p = -\rho c^2$. Compare this result with that for the
energy-momentum tensor of the vacuum.

8.10 Find the most general tensor which can be constructed from the curvature tensor
and the metric tensor and which contains terms no higher than quadratic in the
second-order derivatives of $g_{\mu\nu}$. Hence write down the most general form of the
gravitational field equations in such a theory.

8.11 In the Newtonian limit of weak gravitational fields, for a slowly moving perfect fluid
with pressure $p \ll \rho c^2$ show that the 00-component of the Einstein field equations
with a non-zero cosmological constant $\Lambda$ reduces to

$$\nabla^2\Phi = 4\pi G\rho - \Lambda c^2,$$

where $\nabla^2 = \delta^{ij}\partial_i\partial_j$ and $\rho$ is the proper density of the fluid. Hence show that the
corresponding Newtonian gravitational potential of a spherically symmetric mass
$M$ centred at the origin can be written as

$$\Phi = -\frac{GM}{r} - \frac{\Lambda c^2 r^2}{6},$$

where $r^2 = \delta_{ij}x^i x^j$. Give a physical interpretation of this result.

8.12 In the scalar theory of gravity (8.30) show that, in any inertial frame, the gravitational
potential $\Phi$ produced by a perfect fluid at some event $P$ satisfies

$$\frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2} - \nabla^2\Phi = -4\pi G\left(\rho - \frac{3p}{c^2}\right),$$

where $\rho$ and $p$ are the density and isotropic pressure as measured in the instantaneous
rest frame of the fluid at $P$. Hence show that the theory reduces to Newtonian
gravity in the non-relativistic limit. How might a cosmological constant be included
in the theory?
We now consider how to solve the Einstein field equations and so discover the
metric functions $g_{\mu\nu}$ in any given physical situation. Clearly, the high degree of
non-linearity in the field equations means that a general solution for an arbitrary
matter distribution is analytically intractable. The problem becomes easier if we
look for special solutions, for example those representing spacetimes possessing
symmetries. The first exact solution to Einstein’s equations was found by Karl
Schwarzschild in 1916.¹ As we shall see, the Schwarzschild solution represents
the spacetime geometry outside a spherically symmetric matter distribution.

9.1 The general static isotropic metric 

Schwarzschild sought the metric $g_{\mu\nu}$ representing the static spherically symmetric
gravitational field in the empty space surrounding some massive spherical object
such as a star. Thus, a good starting point for us is to construct the most general
form of the metric for a static spatially isotropic spacetime.

A static spacetime is one for which there exists some timelike coordinate $x^0$ (say) with the
following properties: (i) all the metric components $g_{\mu\nu}$ are independent of $x^0$; and
(ii) the line element $ds^2$ is invariant under the transformation $x^0 \to -x^0$. Note that
(i) does not necessarily imply (ii), as is made clear by the example of a rotating
star: time reversal changes the sense of rotation, but the metric components are
constant in time. A spacetime that satisfies (i) but not (ii) is called stationary.

Thus, starting from the general expression for the line element 

$$ds^2 = g_{\mu\nu}\,dx^\mu\,dx^\nu,$$

we wish to find a set of coordinates in which the $g_{\mu\nu}$ do not depend on the
timelike coordinate $x^0$ and the line element $ds^2$ is invariant under $x^0 \to -x^0$, i.e.
the metric is static, and in which $ds^2$ depends only on rotational invariants of the
spacelike coordinates $x^i$ and their differentials, i.e. the metric is isotropic.

¹ Astonishingly, Schwarzschild derived the solution while in the trenches on the Eastern Front during the First
World War but sadly he did not survive the conflict.

In fact, it is only slightly more complicated to derive the general form of the 
spatially isotropic metric without insisting that it is static. We therefore begin by 
constructing this more general metric. Only after its derivation will we impose 
the additional constraint that the metric is static. 

The only rotational invariants of the spacelike coordinates $x^i$ and their
differentials are

$$\mathbf{x}\cdot\mathbf{x} \equiv r^2, \qquad d\mathbf{x}\cdot d\mathbf{x}, \qquad \mathbf{x}\cdot d\mathbf{x},$$

where $\mathbf{x} \equiv (x^1, x^2, x^3)$ and we have defined the coordinate $r$. Denoting the timelike
coordinate $x^0$ by $t$, we thus find that the most general form of a spatially isotropic
metric must be

$$ds^2 = A(t,r)\,dt^2 - B(t,r)\,dt\,\mathbf{x}\cdot d\mathbf{x} - C(t,r)\,(\mathbf{x}\cdot d\mathbf{x})^2 - D(t,r)\,d\mathbf{x}^2, \tag{9.1}$$

where $A$, $B$, $C$ and $D$ are arbitrary functions of the coordinates $t$ and $r$.

Let us now transform to the (spherical polar) coordinates $(t, r, \theta, \phi)$, defined by

$$x^1 = r\sin\theta\cos\phi, \qquad x^2 = r\sin\theta\sin\phi, \qquad x^3 = r\cos\theta.$$

In this case, we have

$$\mathbf{x}\cdot\mathbf{x} = r^2, \qquad \mathbf{x}\cdot d\mathbf{x} = r\,dr, \qquad d\mathbf{x}\cdot d\mathbf{x} = dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2,$$

and so the general metric (9.1) now takes the form

$$ds^2 = A(t,r)\,dt^2 - B(t,r)\,r\,dt\,dr - C(t,r)\,r^2\,dr^2 - D(t,r)\left(dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2\right).$$

Collecting together terms and absorbing factors of $r$ into our functions, thereby
redefining $A$, $B$, $C$ and $D$, the metric can be written

$$ds^2 = A(t,r)\,dt^2 - B(t,r)\,dt\,dr - C(t,r)\,dr^2 - D(t,r)\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$

If we now define a new radial coordinate by $\bar r^2 = D(t,r)$ and collect together
terms into new arbitrary functions of $t$ and $\bar r$, thereby again redefining $A$, $B$ and $C$,
we can write the metric as

$$ds^2 = A(t,\bar r)\,dt^2 - B(t,\bar r)\,dt\,d\bar r - C(t,\bar r)\,d\bar r^2 - \bar r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{9.2}$$

Let us also introduce a new timelike coordinate $\bar t$ defined by the relation

$$d\bar t = \Phi(t,\bar r)\left[A(t,\bar r)\,dt - \tfrac{1}{2}B(t,\bar r)\,d\bar r\right],$$

where $\Phi(t,\bar r)$ is an integrating factor that makes the right-hand side an exact
differential. Squaring, we obtain

$$d\bar t^2 = \Phi^2\left(A^2\,dt^2 - AB\,dt\,d\bar r + \tfrac{1}{4}B^2\,d\bar r^2\right),$$

from which we find

$$A\,dt^2 - B\,dt\,d\bar r = \frac{1}{A\Phi^2}\,d\bar t^2 - \frac{B^2}{4A}\,d\bar r^2.$$

Thus defining the new functions $\bar A = 1/(A\Phi^2)$ and $\bar B = C + B^2/(4A)$, our metric
(9.2) becomes diagonal and takes the form

$$ds^2 = \bar A(\bar t,\bar r)\,d\bar t^2 - \bar B(\bar t,\bar r)\,d\bar r^2 - \bar r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$

There is no need to retain the bars on the variables, so we can write the metric as

$$ds^2 = A(t,r)\,dt^2 - B(t,r)\,dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{9.3}$$


Thus, the general isotropic metric is specified by two functions of $t$ and $r$, namely
$A(t,r)$ and $B(t,r)$. We will also see that, for surfaces given by $t$, $r$ constant,
the line element (9.3) describes the geometry of 2-spheres, which expresses the
isotropy of the metric. In fact this line element shows that such a surface has a
surface area $4\pi r^2$. However, because $B(t,r)$ is not necessarily equal to unity we
cannot assume that $r$ is the radial distance.

The final step in obtaining the most general stationary isotropic metric is now
trivial. We require the metric functions $g_{\mu\nu}$ to be independent of the timelike
coordinate, which means simply that $A$ and $B$ must be functions only of $r$. Thus,
we have

$$ds^2 = A(r)\,dt^2 - B(r)\,dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{9.4}$$

Moreover, we see immediately that $ds^2$ is already invariant under $t \to -t$, and
so this is the required form of the metric for a general static spatially isotropic
spacetime.


9.2 Solution of the empty-space field equations 

The functions $A(r)$ and $B(r)$ in the general static isotropic metric are determined by
solving the Einstein field equations. We are interested in the spacetime geometry
outside a spherical mass distribution, so we must solve the empty-space field
equations, which simply require the Ricci tensor to vanish:

$$R_{\mu\nu} = 0. \tag{9.5}$$

From equation (7.13) we can write the Ricci tensor as

$$R_{\mu\nu} = \partial_\nu\Gamma^\sigma{}_{\mu\sigma} - \partial_\sigma\Gamma^\sigma{}_{\mu\nu} + \Gamma^\rho{}_{\mu\sigma}\Gamma^\sigma{}_{\rho\nu} - \Gamma^\rho{}_{\mu\nu}\Gamma^\sigma{}_{\rho\sigma}, \tag{9.6}$$

and, in turn, the connection is defined in terms of the metric $g_{\mu\nu}$ by

$$\Gamma^\sigma{}_{\mu\nu} = \tfrac{1}{2}\,g^{\sigma\rho}\left(\partial_\nu g_{\rho\mu} + \partial_\mu g_{\rho\nu} - \partial_\rho g_{\mu\nu}\right). \tag{9.7}$$

Thus, we see that the deceptively simple expression (9.5) in fact equates to a rather
complicated set of differential equations for the components of the metric $g_{\mu\nu}$.

To proceed further, we must calculate the connection coefficients $\Gamma^\sigma{}_{\mu\nu}$
corresponding to our static isotropic metric. This can be done in two ways. The quicker
route (with any metric) is to use the Lagrangian procedure for geodesics discussed
in Section 3.19. This involves writing down the ‘Lagrangian’

$$L = g_{\mu\nu}\,\dot x^\mu \dot x^\nu,$$

in which $\dot x^\mu$ denotes $dx^\mu/d\sigma$, where $\sigma$ is some affine parameter along the
geodesic. Substituting $L$ into the Euler-Lagrange equations then yields the
equations of an affinely parameterised geodesic, from which the connection
coefficients can be identified. Since later we will discuss the motion of particles in the
Schwarzschild geometry, this procedure would be doubly beneficial.

For illustration, however, we will adopt the more traditional (but slower)
method, in which the $\Gamma^\sigma{}_{\mu\nu}$ are calculated directly from the metric $g_{\mu\nu}$, using (9.7).
Thus we first need to identify the metric components from the line element (9.4).
The non-zero elements of $g_{\mu\nu}$ and $g^{\mu\nu}$ are

$$\begin{aligned}
g_{00} &= A(r), & g^{00} &= 1/A(r),\\
g_{11} &= -B(r), & g^{11} &= -1/B(r),\\
g_{22} &= -r^2, & g^{22} &= -1/r^2,\\
g_{33} &= -r^2\sin^2\theta, & g^{33} &= -1/(r^2\sin^2\theta),
\end{aligned}$$

where we note that the contravariant components of the metric are simply the
reciprocals of the covariant components, since the metric is diagonal.

Substituting the metric components into the expression (9.7) for the connection,
we find the expressions given in Table 9.1 (with no sums on Latin indices) with
all the other components equalling zero. Thus, summarising these results, we find
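Table 9.1 The connection coefficients of the general static isotropic metric

$$\begin{aligned}
\Gamma^i{}_{00} &= -\tfrac{1}{2}\,g^{i\rho}\,\partial_\rho g_{00}
  &&\Rightarrow\quad \Gamma^1{}_{00} = \frac{1}{2B(r)}\frac{dA(r)}{dr}\\
\Gamma^0{}_{0i} &= \tfrac{1}{2}\,g^{0\rho}\left(\partial_i g_{\rho 0} + \partial_0 g_{\rho i} - \partial_\rho g_{0i}\right) = \tfrac{1}{2}\,g^{00}\,\partial_i g_{00}
  &&\Rightarrow\quad \Gamma^0{}_{01} = \frac{1}{2A(r)}\frac{dA(r)}{dr}\\
\Gamma^0{}_{ij} &= 0\\
\Gamma^i{}_{ii} &= \tfrac{1}{2}\,g^{i\rho}\left(\partial_i g_{\rho i} + \partial_i g_{\rho i} - \partial_\rho g_{ii}\right) = \tfrac{1}{2}\,g^{ii}\,\partial_i g_{ii}
  &&\Rightarrow\quad \Gamma^1{}_{11} = \frac{1}{2B(r)}\frac{dB(r)}{dr}\\
\Gamma^1{}_{22} &= \tfrac{1}{2}\,g^{11}\left(\partial_2 g_{12} + \partial_2 g_{12} - \partial_1 g_{22}\right) = -\frac{r}{B(r)}\\
\Gamma^1{}_{33} &= -\tfrac{1}{2}\,g^{11}\,\partial_1 g_{33} = -\frac{r\sin^2\theta}{B(r)}\\
\Gamma^2{}_{21} &= \tfrac{1}{2}\,g^{22}\,\partial_1 g_{22} = \frac{1}{r}\\
\Gamma^2{}_{33} &= -\tfrac{1}{2}\,g^{22}\,\partial_2 g_{33} = -\sin\theta\cos\theta\\
\Gamma^3{}_{31} &= \tfrac{1}{2}\,g^{33}\,\partial_1 g_{33} = \frac{1}{r}\\
\Gamma^3{}_{32} &= \tfrac{1}{2}\,g^{33}\,\partial_2 g_{33} = \frac{\cos\theta}{\sin\theta}
\end{aligned}$$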

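that only nine of the 40 independent connection coefficients are non-zero; they
read as follows (a prime denotes differentiation with respect to $r$):

$$\begin{aligned}
\Gamma^0{}_{01} &= A'/(2A), & \Gamma^1{}_{00} &= A'/(2B), & \Gamma^1{}_{11} &= B'/(2B),\\
\Gamma^1{}_{22} &= -r/B, & \Gamma^1{}_{33} &= -(r\sin^2\theta)/B, & \Gamma^2{}_{12} &= 1/r,\\
\Gamma^2{}_{33} &= -\sin\theta\cos\theta, & \Gamma^3{}_{13} &= 1/r, & \Gamma^3{}_{23} &= \cot\theta.
\end{aligned}$$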
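As a cross-check on the list of connection coefficients, the definition (9.7) can be evaluated numerically for a diagonal metric. The following is a minimal sketch, not part of the original derivation: the test functions $A(r) = 2 + r^2$ and $B(r) = 1 + r$ are arbitrary illustrative choices, the derivatives are taken by central finite differences, and all function names are hypothetical.

```python
import math

def metric(x):
    """Diagonal components (g_00, g_11, g_22, g_33) of the metric (9.4)
    at x = (t, r, theta, phi), for illustrative (assumed) A(r) and B(r)."""
    _, r, theta, _ = x
    A = 2.0 + r * r          # arbitrary test function, not from the text
    B = 1.0 + r              # arbitrary test function, not from the text
    return [A, -B, -r * r, -(r * math.sin(theta)) ** 2]

def d_metric(x, rho, eps=1e-5):
    """Central-difference derivative of the diagonal components along x^rho."""
    xp, xm = list(x), list(x)
    xp[rho] += eps
    xm[rho] -= eps
    return [(a - b) / (2 * eps) for a, b in zip(metric(xp), metric(xm))]

def christoffel(x, sigma, mu, nu):
    """Gamma^sigma_{mu nu} from (9.7), specialised to a diagonal metric."""
    g = metric(x)
    dg = [d_metric(x, rho) for rho in range(4)]   # dg[rho][a] = d_rho g_aa
    term = 0.0
    if sigma == mu:
        term += dg[nu][sigma]       # d_nu g_{sigma mu}
    if sigma == nu:
        term += dg[mu][sigma]       # d_mu g_{sigma nu}
    if mu == nu:
        term -= dg[sigma][mu]       # d_sigma g_{mu nu}
    return 0.5 * term / g[sigma]    # g^{sigma sigma} = 1 / g_{sigma sigma}

def count_nonzero(x, tol=1e-8):
    """Count non-zero coefficients with mu <= nu (the 40 independent ones)."""
    return sum(1 for s in range(4) for m in range(4) for n in range(m, 4)
               if abs(christoffel(x, s, m, n)) > tol)

point = (0.0, 2.0, 1.0, 0.3)
print(count_nonzero(point))
print(christoffel(point, 1, 2, 2), -2.0 / 3.0)   # compare with -r/B
```

Because only the symmetric pairs $\mu \le \nu$ are counted, the count can be compared directly with the statement that nine of the 40 independent coefficients are non-zero.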


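We now substitute these connection coefficients into the expression (9.6) in
order to obtain the components $R_{\mu\nu}$ of the Ricci tensor. This requires quite a lot
of tedious (but simple) algebra. Fortunately the off-diagonal components $R_{\mu\nu}$, for
$\mu \ne \nu$, are identically zero, and we find that the diagonal components are

$$R_{00} = -\frac{A''}{2B} + \frac{A'}{4B}\left(\frac{A'}{A} + \frac{B'}{B}\right) - \frac{A'}{rB}, \tag{9.8}$$

$$R_{11} = \frac{A''}{2A} - \frac{A'}{4A}\left(\frac{A'}{A} + \frac{B'}{B}\right) - \frac{B'}{rB}, \tag{9.9}$$

$$R_{22} = \frac{1}{B} - 1 + \frac{r}{2B}\left(\frac{A'}{A} - \frac{B'}{B}\right), \tag{9.10}$$

$$R_{33} = R_{22}\sin^2\theta. \tag{9.11}$$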
The empty-space field equations (9.5) are thus obtained by setting each of the
expressions (9.8-9.11) equal to zero. Of these four equations, only the first three
are useful, since the fourth merely repeats the information contained in the third.
Adding $B/A$ times (9.8) to (9.9) and rearranging gives

$$A'B + AB' = 0,$$

which implies that $AB = \text{constant}$. Let us denote this constant by $\alpha$. Substituting
$B = \alpha/A$ into (9.10) we obtain $A + rA' = \alpha$, which can be written as

$$\frac{d(rA)}{dr} = \alpha.$$

Integrating this equation gives $rA = \alpha(r + k)$, where $k$ is another integration
constant. Thus the functions $A(r)$ and $B(r)$ are given by

$$A(r) = \alpha\left(1 + \frac{k}{r}\right) \qquad\text{and}\qquad B(r) = \left(1 + \frac{k}{r}\right)^{-1}.$$

In solving for $A$ and $B$ we have used only the sum of equations (9.8) and (9.9),
not the separate equations. It is, however, straightforward to check that, with these
forms for $A$ and $B$, the equations (9.8-9.11) are satisfied separately.
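The separate check can also be done numerically. The sketch below is an illustration only (the function names are hypothetical): it evaluates the right-hand sides of (9.8)-(9.10) for $A = \alpha(1 + k/r)$ and $B = (1 + k/r)^{-1}$ using their analytic derivatives, for arbitrary sample values of $\alpha$, $k$ and $r$.

```python
def ricci_diagonal(A, dA, d2A, B, dB, r):
    """R_00, R_11, R_22 of (9.8)-(9.10) from A, A', A'', B, B' at radius r."""
    R00 = -d2A / (2 * B) + dA / (4 * B) * (dA / A + dB / B) - dA / (r * B)
    R11 = d2A / (2 * A) - dA / (4 * A) * (dA / A + dB / B) - dB / (r * B)
    R22 = 1 / B - 1 + r / (2 * B) * (dA / A - dB / B)
    return R00, R11, R22

def vacuum_functions(r, alpha, k):
    """A = alpha (1 + k/r), B = 1 / (1 + k/r), with the derivatives needed."""
    f = 1 + k / r
    df = -k / r**2
    d2f = 2 * k / r**3
    A, dA, d2A = alpha * f, alpha * df, alpha * d2f
    B = 1 / f
    dB = -df / f**2
    return A, dA, d2A, B, dB

# Sample values (arbitrary): alpha = 1, k = -2, r = 5.
A, dA, d2A, B, dB = vacuum_functions(5.0, 1.0, -2.0)
print(ricci_diagonal(A, dA, d2A, B, dB, 5.0))   # each entry should vanish to rounding error
```

Since $R_{33} = R_{22}\sin^2\theta$, the vanishing of the first three components is sufficient.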

It can be seen that the integration constant $k$ must in some way represent the
mass of the object producing the gravitational field, as follows. We can identify
$k$ (and $\alpha$) by considering the weak-field limit, in which we require that

$$\frac{A(r)}{c^2} \to 1 + \frac{2\Phi}{c^2},$$

where $\Phi$ is the Newtonian gravitational potential. Moreover, in the weak-field
limit $r$ can be identified as the radial distance, to a very good approximation. For
a spherically symmetric mass $M$ we thus have $\Phi = -GM/r$, and so we conclude
that $k = -2GM/c^2$ and $\alpha = c^2$. Therefore, the Schwarzschild metric for the empty
spacetime outside a spherical body of mass $M$ is²

$$ds^2 = c^2\left(1 - \frac{2GM}{c^2 r}\right)dt^2 - \left(1 - \frac{2GM}{c^2 r}\right)^{-1}dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2. \tag{9.12}$$

We will use this metric to investigate the physics in the vicinity of a spherical
object of mass $M$, in particular the trajectories of freely falling massive particles
and photons. The Schwarzschild metric is valid down to the surface of the spherical
object, at which point the empty-space field equations no longer hold. Clearly,
the metric is singular at $r = 2GM/c^2 \equiv 2\mu$ (there $A$ vanishes and $B$ is infinite), which is
known as the Schwarzschild radius. As we shall see, if the surface of the massive body contracts within this
radius then the object becomes a Schwarzschild black hole (see Chapter 11). For
the remainder of this chapter, however, we will restrict our attention to the region
$r > 2GM/c^2$. We will often use the shorthand $\mu \equiv GM/c^2$ when writing down
this metric.

² We note that the constant $\alpha$ could have been identified earlier by making the additional assumption that
spacetime is asymptotically flat, i.e. that the line element (9.4) tends to the Minkowski line element in the
limit $r \to \infty$. Thus we require that, in this limit, $A(r) \to c^2$ and $B(r) \to 1$ and so $AB = c^2$.
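To get a feel for the size of the Schwarzschild radius $2GM/c^2$, the following minimal sketch evaluates it for two familiar bodies. The numerical constants are rounded reference values supplied here as assumptions, and the function name is illustrative.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
M_SUN = 1.989e30     # solar mass, kg (rounded)
M_EARTH = 5.972e24   # Earth mass, kg (rounded)

def schwarzschild_radius(mass_kg):
    """Return 2GM/c^2 in metres for a body of the given mass."""
    return 2 * G * mass_kg / C**2

print(schwarzschild_radius(M_SUN))     # of order a few kilometres
print(schwarzschild_radius(M_EARTH))   # of order a centimetre
```

Both values are far smaller than the actual radii of the Sun and the Earth, so for such bodies the restriction to $r > 2GM/c^2$ is automatically satisfied everywhere outside the surface.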


9.3 Birkhoff’s theorem 

If we do not demand that our original metric is static (or stationary) but only that
it is isotropic, then we would substitute the more general form (9.3),

$$ds^2 = A(t,r)\,dt^2 - B(t,r)\,dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right),$$

into Einstein’s empty-space field equations $R_{\mu\nu} = 0$ in order to determine the
functions $A(t,r)$ and $B(t,r)$. On repeating our earlier analysis, we would find
some additional non-zero connection coefficients and components of the Ricci
tensor. However, on solving this new set of equations, one discovers that the
resulting metric must still be the Schwarzschild metric (9.12). Thus, we obtain
Birkhoff’s theorem, which states that the spacetime geometry outside a general
spherically symmetric matter distribution is the Schwarzschild geometry.

This is an unexpected result because in Newtonian theory spherical symmetry
has nothing to do with time dependence. This highlights the special character of
the empty-space Einstein equations and of the solutions they admit. In particular,
Birkhoff’s theorem implies that if a spherically symmetric star undergoes strictly
radial pulsations then it cannot propagate any disturbance into the surrounding
space. Looking ahead to Chapter 18, this means that a radially pulsating star
cannot emit gravitational waves.

One can show that the converse of Birkhoff’s theorem is not true, i.e. a matter
distribution that gives rise to the Schwarzschild geometry outside it need not be
spherically symmetric. Indeed, some specific counter-examples are known.


9.4 Gravitational redshift for a fixed emitter and receiver 

We begin our discussion of the physics in the vicinity of a spherical mass $M$ by
considering the phenomenon of gravitational redshift. In particular, we consider
the specific example of an emitter, at fixed spatial coordinates $(r_E, \theta_E, \phi_E)$,
which emits a photon that is received by an observer at fixed spatial coordinates
$(r_R, \theta_R, \phi_R)$. If $t_E$ is the coordinate time of emission and $t_R$ the coordinate time
of reception then the photon travels from the event $(t_E, r_E, \theta_E, \phi_E)$ to the event
$(t_R, r_R, \theta_R, \phi_R)$ along a null geodesic in the Schwarzschild spacetime. This is
illustrated schematically in Figure 9.1, which also shows a second photon, emitted
at a later coordinate time $t_E + \Delta t_E$ and received at $t_R + \Delta t_R$.

Figure 9.1 Schematic illustration of the emission and reception of two light signals.
The emitter, at fixed $(r_E, \theta_E, \phi_E)$, sends the signals at events A and B; the receiver,
at fixed $(r_R, \theta_R, \phi_R)$, receives them at events C, $(t_R, r_R, \theta_R, \phi_R)$, and
D, $(t_R + \Delta t_R, r_R, \theta_R, \phi_R)$.

In Appendix 9A we present an approach for calculating gravitational redshifts 
in more general situations. Nevertheless, in this simple case, it is instructive to 
derive the result in a more elementary manner: we need only use the fact that the 
photon geodesic is a null curve.³ Thus $ds^2 = 0$ at all points along it, and from the
Schwarzschild metric (9.12), we find that

$$c^2\left(1 - \frac{2\mu}{r}\right)dt^2 = \left(1 - \frac{2\mu}{r}\right)^{-1}dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2,$$

where we have written $\mu = GM/c^2$. Let us consider the first signal. Thus, if $\sigma$ is
some affine parameter along the null geodesic then we have

$$\frac{dt}{d\sigma} = \frac{1}{c}\left(1 - \frac{2\mu}{r}\right)^{-1/2}\left[-g_{ij}\,\frac{dx^i}{d\sigma}\frac{dx^j}{d\sigma}\right]^{1/2},$$

where as before we use the notation $[x^\mu] = (t, r, \theta, \phi)$, the $g_{\mu\nu}$ are the components
of the Schwarzschild metric and Latin indices run from 1 to 3. On integrating,
we obtain

$$t_R - t_E = \frac{1}{c}\int_{\sigma_E}^{\sigma_R}\left(1 - \frac{2\mu}{r}\right)^{-1/2}\left[-g_{ij}\,\frac{dx^i}{d\sigma}\frac{dx^j}{d\sigma}\right]^{1/2}d\sigma,$$

where $\sigma_E$ is the value of $\sigma$ at emission and $\sigma_R$ the value at reception. The
important thing to notice about this expression is that the integral on the
right-hand side depends only on the path through space. Thus, for a spatially fixed
emitter and receiver, $t_R - t_E$ is the same for all signals sent. Thus the coordinate
time difference $\Delta t_E$ separating events A and B is equal to the coordinate time
$\Delta t_R$ between events C and D,

$$\Delta t_R = \Delta t_E.$$

³ This approach is based on that presented in J. Foster & J. D. Nightingale, A Short Course in General Relativity,
Springer-Verlag, 1995.


Now let us consider the proper time intervals along the worldlines of the emitter
and receiver between each pair of events. Along both the emitter’s and receiver’s
worldlines, $dr = d\theta = d\phi = 0$. Thus, from the Schwarzschild line element (9.12),
in both cases

$$c^2\,d\tau^2 \equiv ds^2 = c^2\left(1 - \frac{2\mu}{r}\right)dt^2.$$

Moreover, in each case $r$ is constant along the worldline, so we can immediately
integrate this equation to obtain

$$\Delta\tau_E = \left(1 - \frac{2\mu}{r_E}\right)^{1/2}\Delta t_E \qquad\text{and}\qquad \Delta\tau_R = \left(1 - \frac{2\mu}{r_R}\right)^{1/2}\Delta t_R.$$

Thus, since $\Delta t_R = \Delta t_E$, we find that

$$\frac{\Delta\tau_R}{\Delta\tau_E} = \left(\frac{1 - 2\mu/r_R}{1 - 2\mu/r_E}\right)^{1/2},$$

which forms the basis of the formula for gravitational redshift. If we think of the
two light signals as, for example, the two wavecrests of an electromagnetic wave,
then it is clear that this ratio must also be the ratio of the period of the wave
as observed by the receiver and emitter respectively. Thus the frequencies of the
photon as measured by each observer are related by

$$\frac{\nu_R}{\nu_E} = \left[\frac{1 - 2GM/(r_E c^2)}{1 - 2GM/(r_R c^2)}\right]^{1/2},$$

which shows that $\nu_R < \nu_E$ if $r_R > r_E$. The photon redshift $z$ is defined by

$$1 + z = \frac{\nu_E}{\nu_R}.$$
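As a numerical illustration of the redshift formula for a fixed emitter and receiver, the following minimal sketch evaluates $z$ for light leaving the surface of the Sun and received far away. The solar mass and radius are rounded reference values supplied here as assumptions, the receiver is idealised as being at infinite radius, and the function names are illustrative.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # m/s (rounded)
M_SUN = 1.989e30     # kg (rounded)
R_SUN = 6.957e8      # m (rounded)

def frequency_ratio(r_emit, r_recv, mu):
    """nu_R / nu_E for emitter and receiver at fixed Schwarzschild r."""
    return math.sqrt((1 - 2 * mu / r_emit) / (1 - 2 * mu / r_recv))

def redshift(r_emit, r_recv, mu):
    """z defined by 1 + z = nu_E / nu_R."""
    return 1.0 / frequency_ratio(r_emit, r_recv, mu) - 1.0

mu_sun = G * M_SUN / C**2                     # GM/c^2 in metres
print(redshift(R_SUN, math.inf, mu_sun))      # roughly mu/R for a weak field
```

For $\mu \ll r_E$ and a distant receiver the result is close to the weak-field estimate $z \approx GM/(r_E c^2)$.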
There is an important point to notice about this derivation. It can be generalised
very easily to any spacetime in which we can choose coordinates such that the
spacetime is stationary ($\partial_0 g_{\mu\nu} = 0$) and $g_{0i}(\mathbf{x}) = 0$. In this case,

$$ds^2 = g_{00}(\mathbf{x})\,dt^2 + g_{ij}(\mathbf{x})\,dx^i\,dx^j,$$

where, as indicated, all the metric components are independent of $t$. By repeating
the above derivation for an emitter and observer at fixed spatial coordinates in
this more general spacetime, we easily find that

$$\frac{\nu_R}{\nu_E} = \left[\frac{g_{00}(\mathbf{x}_E)}{g_{00}(\mathbf{x}_R)}\right]^{1/2}.$$

The derivations presented here depend crucially upon the fact that the emitter
and receiver are spatially fixed. However, this is not often physically realistic.
For example, we might want to calculate the gravitational redshift of a photon
if the emitter or receiver (or both) are in free fall or moving in some arbitrary
manner. A method for calculating redshifts in such general situations is given in
Appendix 9A. In order to use this formalism, however, we require knowledge
of the paths followed by freely falling particles and photons. Therefore, we now
consider geodesics in the Schwarzschild geometry.


9.5 Geodesics in the Schwarzschild geometry 

In deriving the Schwarzschild line element,

$$ds^2 = c^2\left(1 - \frac{2\mu}{r}\right)dt^2 - \left(1 - \frac{2\mu}{r}\right)^{-1}dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2, \tag{9.15}$$

we also calculated the connection coefficients $\Gamma^\mu{}_{\nu\sigma}$ for this metric. Thus we could
now write down the geodesic equations for the Schwarzschild geometry in the
form

$$\frac{d^2x^\mu}{d\sigma^2} + \Gamma^\mu{}_{\nu\sigma}\,\frac{dx^\nu}{d\sigma}\frac{dx^\sigma}{d\sigma} = 0,$$

where $\sigma$ is some affine parameter along the geodesic $x^\mu(\sigma)$. It is more
instructive, however, to obtain the geodesic equations using the very neat ‘Lagrangian’
procedure discussed in Chapter 3.

Thus, let us consider the ‘Lagrangian’ $L = g_{\mu\nu}\,\dot x^\mu \dot x^\nu$, where $\dot x^\mu \equiv dx^\mu/d\sigma$.
Using (9.15), $L$ is given by

$$L = c^2\left(1 - \frac{2\mu}{r}\right)\dot t^2 - \left(1 - \frac{2\mu}{r}\right)^{-1}\dot r^2 - r^2\left(\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right). \tag{9.16}$$
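The geodesic equations are then obtained by substituting this form for $L$ into the
Euler-Lagrange equations

$$\frac{d}{d\sigma}\left(\frac{\partial L}{\partial\dot x^\mu}\right) - \frac{\partial L}{\partial x^\mu} = 0.$$

Performing this calculation, we find that the four resulting geodesic equations (for
$\mu = 0, 1, 2, 3$) are given by

$$\left(1 - \frac{2\mu}{r}\right)\dot t = k, \tag{9.17}$$

$$\left(1 - \frac{2\mu}{r}\right)^{-1}\ddot r + \frac{\mu c^2}{r^2}\,\dot t^2 - \left(1 - \frac{2\mu}{r}\right)^{-2}\frac{\mu}{r^2}\,\dot r^2 - r\left(\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right) = 0, \tag{9.18}$$

$$\ddot\theta + \frac{2}{r}\,\dot r\,\dot\theta - \sin\theta\cos\theta\,\dot\phi^2 = 0, \tag{9.19}$$

$$r^2\sin^2\theta\,\dot\phi = h. \tag{9.20}$$

In (9.17) and (9.20) respectively, the quantities $k$ and $h$ are constants. These two
equations are derived immediately since $L$ is not an explicit function of $t$ or $\phi$.

We see immediately that $\theta = \pi/2$ satisfies the third geodesic equation (9.19).
Because of the spherical symmetry of the Schwarzschild metric we can therefore,
with no loss of generality, confine our attention to particles moving in the
‘equatorial plane’ given by $\theta = \pi/2$. In this case our set of geodesic equations reduces to

$$\left(1 - \frac{2\mu}{r}\right)\dot t = k, \tag{9.21}$$

$$\left(1 - \frac{2\mu}{r}\right)^{-1}\ddot r + \frac{\mu c^2}{r^2}\,\dot t^2 - \left(1 - \frac{2\mu}{r}\right)^{-2}\frac{\mu}{r^2}\,\dot r^2 - r\,\dot\phi^2 = 0, \tag{9.22}$$

$$r^2\dot\phi = h. \tag{9.23}$$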

These equations are valid for both null and non-null affinely parameterised
geodesics. In each of these cases, however, it is easier to replace the rather
complicated $r$-equation (9.22) by a first integral of the geodesic equations. For
a non-null geodesic this first integral is simply

$$g_{\mu\nu}\,\dot x^\mu \dot x^\nu = c^2, \tag{9.24}$$

whereas for a null geodesic it is

$$g_{\mu\nu}\,\dot x^\mu \dot x^\nu = 0. \tag{9.25}$$

Before going on to discuss separately non-null and null geodesics in the
equatorial plane $\theta = \pi/2$, it is instructive to discuss the physical interpretation of the
constants $k$ and $h$. One can arrive at equations (9.21) and (9.23) simply by using
the fact that the components $p_0$ and $p_3$ of a particle’s 4-momentum are conserved
along geodesics since $L$ does not depend explicitly on $t$ and $\phi$ (remember that $\mathbf{p}$
is proportional to the tangent vector to the geodesic at each point). For notational
simplicity, for a massive particle, we shall take the particle to have unit rest
mass and choose the affine parameter to be the particle’s proper time $\tau$, so that
$p^\mu = \dot x^\mu$. Similarly, for a massless particle we are free to choose an appropriate
affine parameter along the null geodesic, once again such that $p^\mu = \dot x^\mu$. Thus, for
$\theta = \pi/2$ we may write

$$p_0 = g_{00}\,\dot t = c^2\left(1 - \frac{2\mu}{r}\right)\dot t = kc^2, \tag{9.26}$$

$$p_3 = g_{33}\,\dot\phi = -r^2\dot\phi = -h, \tag{9.27}$$

where in the last equality on each line we have defined the constants in a manner
that coincides with (9.21) and (9.23). Let us first consider the constant $k$. If, at some
event, an observer with 4-velocity $\mathbf{u}$ encountered a particle with 4-momentum $\mathbf{p}$
then he would measure the particle’s energy to be

$$E = \mathbf{p}\cdot\mathbf{u} = p_\mu u^\mu.$$

For an observer at rest at infinity we have $[u^\mu] = (1, 0, 0, 0)$ and so $E = p_0 = kc^2$
(which is conserved along the particle geodesic). Thus we may take $k = E/c^2$,
where $E$ is the total energy of the particle in its orbit. Since for massive particles
we have assumed unit mass, in the general case we have $k = E/(m_0 c^2)$, where
$m_0$ is the rest mass of the particle. For the constant $h$, we can see immediately
from (9.27) that it equals the specific angular momentum of the particle and that
(as a result of the choice of signature for the metric) $p_3$ is equal to minus the
specific angular momentum. Finally, we note that the results (9.17-9.20) can also
be derived using the alternative form (3.56) of the geodesic equations, which may
be written

$$\dot p_\mu = \tfrac{1}{2}\left(\partial_\mu g_{\nu\sigma}\right)p^\nu p^\sigma. \tag{9.28}$$

9.6 Trajectories of massive particles 

The trajectory of a massive particle is a timelike geodesic. Considering motion in
the equatorial plane, we replace the geodesic equation (9.22) by (9.24), where $g_{\mu\nu}$
is taken from (9.15) with $\theta = \pi/2$. Moreover, since we are considering a timelike
geodesic we can choose our affine parameter $\sigma$ to be the proper time $\tau$ along the
path. Thus we find that the worldline $x^\mu(\tau)$ of a massive particle moving in the
equatorial plane of the Schwarzschild geometry must satisfy the equations

$$\left(1 - \frac{2\mu}{r}\right)\dot t = k, \tag{9.29}$$

$$c^2\left(1 - \frac{2\mu}{r}\right)\dot t^2 - \left(1 - \frac{2\mu}{r}\right)^{-1}\dot r^2 - r^2\dot\phi^2 = c^2, \tag{9.30}$$

$$r^2\dot\phi = h. \tag{9.31}$$

By substituting (9.29) and (9.31) into (9.30), we obtain the combined ‘energy’
equation for the $r$-coordinate,

$$\dot r^2 + \frac{h^2}{r^2}\left(1 - \frac{2\mu}{r}\right) - \frac{2\mu c^2}{r} = c^2\left(k^2 - 1\right), \tag{9.32}$$

where we have written $\mu = GM/c^2$. We shall use this ‘energy’ equation to discuss
radial free fall and the stability of orbits. Note that the right-hand side is a constant
of the motion. We can verify the physical meaning of the constant $k$ by noting
that $E \propto k$. The constant of proportionality is fixed by requiring that, for a particle
at rest at $r = \infty$, we have $E = m_0 c^2$. Letting $r \to \infty$ and $\dot r = 0$ in (9.32), we thus
require $k^2 = 1$. Hence, as previously, we must have $k = E/(m_0 c^2)$, where $E$ is
the total energy of the particle in its orbit.

A second useful equation, which enables us to determine the shape of a particle
orbit (i.e. $r$ as a function of $\phi$), may be found by using $h = r^2\dot\phi$ to express $\dot r$ in
the energy equation (9.32) as

$$\frac{dr}{d\tau} = \frac{dr}{d\phi}\frac{d\phi}{d\tau} = \frac{h}{r^2}\frac{dr}{d\phi}.$$

We thus obtain

$$\left(\frac{h}{r^2}\frac{dr}{d\phi}\right)^2 + \frac{h^2}{r^2} = c^2\left(k^2 - 1\right) + \frac{2GM}{r} + \frac{2GMh^2}{c^2 r^3}.$$

If we make the substitution $u \equiv 1/r$ that is usually employed in Newtonian orbit
calculations, we find that

$$\left(\frac{du}{d\phi}\right)^2 + u^2 = \frac{c^2}{h^2}\left(k^2 - 1\right) + \frac{2GMu}{h^2} + \frac{2GMu^3}{c^2}.$$

We now differentiate this equation with respect to $\phi$ to obtain finally

$$\frac{d^2u}{d\phi^2} + u = \frac{GM}{h^2} + \frac{3GM}{c^2}\,u^2. \tag{9.33}$$

In Newtonian gravity, the equations of motion of a particle of mass m in the
equatorial plane $\theta = \pi/2$ may be determined from the Lagrangian

$$L = \tfrac{1}{2}m\left(\dot r^2 + r^2\dot\phi^2\right) + \frac{GMm}{r}.$$

From the Euler-Lagrange equations we have

$$r^2\dot\phi = h, \qquad \ddot r = \frac{h^2}{r^3} - \frac{GM}{r^2},$$

where the integration constant $h$ is the specific angular momentum of the particle.
If we now substitute $u = 1/r$ and eliminate the time variable, the Newtonian
equation of motion for planetary orbits is obtained:

$$\frac{d^2u}{d\phi^2} + u = \frac{GM}{h^2}. \tag{9.34}$$

We must remember, however, that in this equation $u = 1/r$, where $r$ is the radial
distance from the mass, whereas in (9.33) $r$ is a radial coordinate that is related
to distance through the metric. Nevertheless, the forms of the two equations are
very similar except for the extra term $3GMu^2/c^2$ in (9.33). We note that this term
correctly tends to zero as $c \to \infty$.
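To see what the extra term does to a bound orbit, both shape equations can be integrated numerically. The sketch below is an illustration under stated assumptions, not a method from the text: lengths are measured in units of $\mu = GM/c^2$, the values $h^2/c^2 = 100\mu^2$ and starting periapsis $u_0 = 0.011/\mu$ are arbitrary, the integrator is a hand-written fourth-order Runge-Kutta stepper, and all function names are hypothetical.

```python
import math

def shape_rhs(u, mu, h2_over_c2, relativistic=True):
    """d^2u/dphi^2 from (9.33), or from (9.34) if relativistic is False.
    Here GM/h^2 = mu / (h^2/c^2) and 3GM u^2 / c^2 = 3 mu u^2."""
    accel = mu / h2_over_c2 - u
    if relativistic:
        accel += 3 * mu * u * u
    return accel

def periapsis_angle(u0, mu, h2_over_c2, relativistic=True, dphi=1e-3):
    """Angle swept between successive maxima of u (periapses), starting
    from a periapsis with u = u0 and du/dphi = 0."""
    def f(u, w):
        return w, shape_rhs(u, mu, h2_over_c2, relativistic)

    u, w, phi = u0, 0.0, 0.0
    seen_apoapsis = False
    while phi < 100.0:
        k1u, k1w = f(u, w)
        k2u, k2w = f(u + 0.5 * dphi * k1u, w + 0.5 * dphi * k1w)
        k3u, k3w = f(u + 0.5 * dphi * k2u, w + 0.5 * dphi * k2w)
        k4u, k4w = f(u + dphi * k3u, w + dphi * k3w)
        u_new = u + dphi * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        w_new = w + dphi * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        if w < 0.0 <= w_new:
            seen_apoapsis = True          # du/dphi turned positive: apoapsis passed
        elif seen_apoapsis and w > 0.0 >= w_new:
            # du/dphi turned negative again: interpolate the periapsis angle
            return phi + dphi * w / (w - w_new)
        u, w, phi = u_new, w_new, phi + dphi
    raise RuntimeError("no periapsis found in the integration range")

newtonian = periapsis_angle(0.011, 1.0, 100.0, relativistic=False)
schwarzschild = periapsis_angle(0.011, 1.0, 100.0, relativistic=True)
print(newtonian - 2 * math.pi)       # Newtonian orbit closes after 2*pi
print(schwarzschild - 2 * math.pi)   # extra term makes the periapsis advance
```

With these illustrative values the Newtonian orbit closes after one revolution, whereas the orbit governed by (9.33) reaches its next periapsis only after sweeping through more than $2\pi$.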

Two interesting special cases of massive-particle orbits are worth investigating
in detail, namely radial motion and motion in a circle.


9.7 Radial motion of massive particles 

For radial motion $\phi$ is constant, which implies that $h = 0$. Thus, (9.32) reduces to

$$\dot r^2 = c^2\left(k^2 - 1\right) + \frac{2GM}{r}. \tag{9.35}$$

Differentiating this equation with respect to $\tau$ and dividing through by $\dot r$ gives

$$\ddot r = -\frac{GM}{r^2}, \tag{9.36}$$

which has precisely the same form as the corresponding equation of motion in
Newtonian gravity. This does not imply, however, that general relativity and
Newtonian gravity predict the same physical behaviour. It should be
remembered that in (9.36) the coordinate $r$ is not the radial distance, and dots indicate
derivatives with respect to proper time rather than universal time.
As a specific example, consider a particle dropped from rest at $r = R$. From
(9.35) we see immediately that $k^2 = 1 - 2GM/(c^2 R)$, so (9.35) can be written

$$\frac{\dot r^2}{2} = GM\left(\frac{1}{r} - \frac{1}{R}\right). \tag{9.37}$$

This has the same form as the Newtonian formula equating the gain in kinetic
energy to the loss in gravitational potential energy for a particle (of unit mass)
falling from rest at $r = R$. This provides a useful way to remember this equation,
but the different meanings of $r$ and of the dot should again be borne in mind.

We could continue our analysis of this quite general situation, but we can
illustrate the main physical points by considering a particle dropped from rest at
infinity. In this case $k = 1$ and the algebra is much less complicated. Thus, setting
$k = 1$ in the geodesic equation (9.29) and in (9.35), we obtain

$$\frac{dt}{d\tau} = \left(1 - \frac{2\mu}{r}\right)^{-1}, \tag{9.38}$$

$$\frac{dr}{d\tau} = -\left(\frac{2\mu c^2}{r}\right)^{1/2}, \tag{9.39}$$

where in (9.39) we have taken the negative square root. These equations form the
basis of our discussion of a radially infalling particle dropped from rest at infinity.
From these equations we see immediately that the components of the 4-velocity
of the particle in the $(t, r, \theta, \phi)$ coordinate system are simply

$$[u^\mu] \equiv \left[\frac{dx^\mu}{d\tau}\right] = \left(\left(1 - \frac{2\mu}{r}\right)^{-1},\; -\left(\frac{2\mu c^2}{r}\right)^{1/2},\; 0,\; 0\right).$$

Equation (9.39) determines the trajectory $r(\tau)$. On integrating (9.39) we
immediately obtain

$$\tau = \frac{2}{3}\sqrt{\frac{r_0^3}{2\mu c^2}} - \frac{2}{3}\sqrt{\frac{r^3}{2\mu c^2}},$$

where we have written the integration constant in a form such that $\tau = 0$ at $r = r_0$.
Thus $\tau$ is the proper time experienced by the particle in falling from $r = r_0$ to a
coordinate radius $r$.
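The closed form for the proper time can be checked against a direct numerical integration of (9.39), $d\tau = -dr\,(2\mu c^2/r)^{-1/2}$. The following is a minimal sketch with illustrative function names; the midpoint rule, the number of steps and the sample values $\mu = 1477$ m (roughly $GM/c^2$ for the Sun), $r_0 = 10\mu$ and $r = 2\mu$ are assumptions made for the example.

```python
import math

C = 2.998e8   # speed of light in m/s (rounded)

def proper_time_closed(r0, r, mu):
    """Proper time to fall from r0 to r, starting from rest at infinity."""
    scale = 2 * mu * C**2
    return (2.0 / 3.0) * (math.sqrt(r0**3 / scale) - math.sqrt(r**3 / scale))

def proper_time_quadrature(r0, r, mu, n=100000):
    """Midpoint-rule integral of dr / sqrt(2 mu c^2 / r) between r and r0."""
    step = (r0 - r) / n
    total = 0.0
    for i in range(n):
        r_mid = r + (i + 0.5) * step
        total += step / math.sqrt(2 * mu * C**2 / r_mid)
    return total

mu = 1477.0   # metres; approximately GM/c^2 for one solar mass (assumed value)
print(proper_time_closed(10 * mu, 2 * mu, mu))
print(proper_time_quadrature(10 * mu, 2 * mu, mu))
```

The two numbers should agree closely, and setting `r = 0` in `proper_time_closed` gives the finite total proper time to reach the origin.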

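Instead of parameterising the worldline in terms of the proper time $\tau$, we can
alternatively describe the path as $r(t)$, thereby mapping out the trajectory of the
particle in the $(t, r)$ coordinate plane. This is easily achieved by writing

$$\frac{dr}{dt} = \frac{dr}{d\tau}\frac{d\tau}{dt} = -\left(\frac{2\mu c^2}{r}\right)^{1/2}\left(1 - \frac{2\mu}{r}\right). \tag{9.40}$$

On integrating, we find

$$t = \frac{2}{3}\left(\sqrt{\frac{r_0^3}{2\mu c^2}} - \sqrt{\frac{r^3}{2\mu c^2}}\right) + \frac{4\mu}{c}\left(\sqrt{\frac{r_0}{2\mu}} - \sqrt{\frac{r}{2\mu}}\right) + \frac{2\mu}{c}\ln\left|\frac{\left(\sqrt{r/2\mu} + 1\right)\left(\sqrt{r_0/2\mu} - 1\right)}{\left(\sqrt{r/2\mu} - 1\right)\left(\sqrt{r_0/2\mu} + 1\right)}\right|,$$

where the choice of the integration constant gives $t = 0$ at $r = r_0$.
In particular we note that

$$\tau \to \frac{2}{3}\sqrt{\frac{r_0^3}{2\mu c^2}} \quad\text{as } r \to 0, \qquad\qquad t \to \infty \quad\text{as } r \to 2\mu.$$

Evidently, the particle takes a finite proper time to reach $r = 0$. When the worldline
is expressed in the form $r(t)$, however, we see that $r$ asymptotically approaches $2\mu$
as $t \to \infty$. Since the coordinate time $t$ corresponds to the proper time experienced
by a stationary observer at large radius, we must therefore conclude that, to such
an observer, it takes an infinite time for the particle to reach $r = 2\mu$. We return
to this point later.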
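The growth of the coordinate time near $r = 2\mu$ can be made concrete numerically. The sketch below is illustrative only: it evaluates the integral of (9.40) in the closed form obtained with the substitution $x = \sqrt{r/2\mu}$, in units with $c = 1$ and $\mu = 1$ by default, and it is restricted to $r, r_0 > 2\mu$; the function names are hypothetical.

```python
import math

def coordinate_time(r0, r, mu=1.0, c=1.0):
    """Coordinate time t(r) for infall from rest at infinity, with t = 0 at r0.
    Obtained by integrating dt/dr = -sqrt(r / (2 mu c^2)) / (1 - 2 mu / r)."""
    if r <= 2 * mu or r0 <= 2 * mu:
        raise ValueError("formula used here assumes r and r0 exceed 2 mu")
    x, x0 = math.sqrt(r / (2 * mu)), math.sqrt(r0 / (2 * mu))
    cubic = (4 * mu / (3 * c)) * (x0**3 - x**3)
    linear = (4 * mu / c) * (x0 - x)
    log_term = (2 * mu / c) * math.log(((x + 1) * (x0 - 1)) / ((x - 1) * (x0 + 1)))
    return cubic + linear + log_term

def dt_dr(r, mu=1.0, c=1.0):
    """Reciprocal of (9.40), used as an independent check of the closed form."""
    return -math.sqrt(r / (2 * mu * c**2)) / (1 - 2 * mu / r)

for r in (5.0, 2.1, 2.001, 2.000001):
    print(r, coordinate_time(10.0, r))   # grows without bound as r -> 2 mu
```

The printed values increase only logarithmically in the distance from $r = 2\mu$, which is why the divergence is slow but unbounded.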

It is interesting to ask what velocity a stationary observer at $r$ measures for
the infalling particle as it passes. From the Schwarzschild metric (9.15) we see
that, for a stationary observer at coordinate radius $r$, a coordinate time interval $dt$
corresponds to a proper time interval

$$dt' = \left(1 - \frac{2\mu}{r}\right)^{1/2}dt.$$

Similarly, a radial coordinate separation $dr$ corresponds to a proper radial distance
measured by the observer equal to

$$dr' = \left(1 - \frac{2\mu}{r}\right)^{-1/2}dr.$$

Thus the velocity of the radially infalling particle, as measured by a stationary
observer at $r$, is given by

$$\frac{dr'}{dt'} = \left(1 - \frac{2\mu}{r}\right)^{-1}\frac{dr}{dt} = -\left(\frac{2\mu c^2}{r}\right)^{1/2}. \tag{9.41}$$

Thus we find the rather surprising result that, as the particle approaches $r = 2\mu$,
a stationary observer at that radius observes that the particle’s velocity tends to $c$.
We note that the equation (9.41) is only physically valid for $r > 2\mu$ since, as we
shall see, it is impossible to have a stationary observer at $r < 2\mu$.
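The contrast between the coordinate velocity (9.40) and the locally measured velocity (9.41) is easy to tabulate. The following minimal sketch works with the dimensionless radius $r/\mu$ and speeds in units of $c$; the function names are illustrative, and the restriction to $r > 2\mu$ is enforced explicitly.

```python
import math

def coordinate_speed(r_over_mu):
    """|dr/dt| / c from (9.40) for infall from rest at infinity."""
    return math.sqrt(2.0 / r_over_mu) * (1.0 - 2.0 / r_over_mu)

def locally_measured_speed(r_over_mu):
    """|dr'/dt'| / c seen by a stationary observer at r, as in (9.41)."""
    if r_over_mu <= 2.0:
        raise ValueError("no stationary observer exists for r <= 2 mu")
    return coordinate_speed(r_over_mu) / (1.0 - 2.0 / r_over_mu)

for x in (100.0, 8.0, 3.0, 2.000001):
    print(x, coordinate_speed(x), locally_measured_speed(x))
```

Near $r = 2\mu$ the coordinate speed tends to zero while the locally measured speed tends to $c$, which is the behaviour described in the text.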



9.8 Circular motion of massive particles 


For circular motion in the equatorial plane we have $r = \text{constant}$, and so $\dot r = \ddot r = 0$.
Setting $u = 1/r = \text{constant}$ in the ‘shape’ equation (9.33) we have

$$u = \frac{GM}{h^2} + \frac{3GM}{c^2}\,u^2,$$

from which we find that

$$h^2 = \frac{\mu c^2 r^2}{r - 3\mu}.$$

Putting $\dot r = 0$ in the energy equation (9.32) and substituting the above expression
for $h^2$ allows us to identify the constant $k$:

$$k = \frac{1 - 2\mu/r}{\left(1 - 3\mu/r\right)^{1/2}}. \tag{9.42}$$

The energy of a particle of rest mass $m_0$ in a circular orbit of radius $r$ is then given by
$E = k m_0 c^2$. We can use this result to determine which circular orbits are bound.
For this we require $E < m_0 c^2$, so the limits on $r$ for the orbit to be bound are
given by $k = 1$. This yields

$$\left(1 - 2\mu/r\right)^2 = 1 - 3\mu/r,$$

which is satisfied when $r = 4\mu$ or $r = \infty$. Thus, over the range $4\mu < r < \infty$,
circular orbits are bound. A plot of $E/(m_0 c^2)$ as a function of $r/\mu$ is shown in
Figure 9.2.

Figure 9.2 The variation of $k = E/(m_0 c^2)$ as a function of $r/\mu$ for a circular
orbit of a massive particle in the Schwarzschild geometry.
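The curve described in the caption of Figure 9.2 can be tabulated directly from (9.42). The sketch below is a simple illustration in units where $\mu = 1$ and $c = 1$; the function names are hypothetical. It also checks the expression for $h^2$ against the energy equation (9.32) with $\dot r = 0$.

```python
import math

def k_circular(x):
    """k = E / (m_0 c^2) from (9.42), with x = r / mu (requires x > 3)."""
    return (1.0 - 2.0 / x) / math.sqrt(1.0 - 3.0 / x)

def h2_circular(x):
    """h^2 / (mu c)^2 for a circular orbit: x^2 / (x - 3)."""
    return x * x / (x - 3.0)

def energy_equation_residual(x):
    """(9.32) with rdot = 0, in units mu = c = 1; should vanish."""
    lhs = h2_circular(x) / x**2 * (1.0 - 2.0 / x) - 2.0 / x
    return lhs - (k_circular(x) ** 2 - 1.0)

for x in (3.5, 4.0, 6.0, 10.0, 100.0):
    print(x, k_circular(x))   # k = 1 at x = 4, k < 1 for larger x
```

Values of $k$ above unity for $3\mu < r < 4\mu$ correspond to unbound circular orbits, in line with the range quoted above.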

We can obtain another useful result by substituting our expression for $h^2$ into 
the geodesic equation $r^2 \dot\phi = h$; then we can write 

$$\left(\frac{d\phi}{d\tau}\right)^2 = \frac{\mu c^2}{r^2 (r - 3\mu)}.$$

The significance of this equation is that it cannot be satisfied for circular orbits 
with $r < 3\mu$. Such orbits cannot be geodesics (since they do not satisfy the 
geodesic equations) and so cannot be followed by freely falling particles. Thus, 
according to general relativity a free massive particle cannot maintain a circular 
orbit with $r < 3\mu$ around a spherical massive body, no matter how large the 
angular momentum of the particle. This is very different from Newtonian theory. 
It is also useful to calculate the expression for $d\phi/dt$, which is given by 

$$\left(\frac{d\phi}{dt}\right)^2 = \left(\frac{d\phi}{d\tau}\frac{d\tau}{dt}\right)^2 = \frac{(1 - 2\mu/r)^2}{k^2}\left(\frac{d\phi}{d\tau}\right)^2 = \frac{\mu c^2}{r^3} = \frac{GM}{r^3}.$$

This expression is exactly the same as the Newtonian expression for the angular 
velocity, and hence the period, of a circular orbit of radius $r$. Although we cannot 
say that $r$ is the radius of the orbit in the relativistic case, we see that the spatial 
distance travelled in one complete revolution is $2\pi r$, just as in the Newtonian case. 
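
As a hedged numerical cross-check of the last result, the sketch below works in 
units with $\mu = c = 1$ (an assumption made here for convenience), so that the 
Newtonian value $GM/r^3$ becomes $1/x^3$ with $x = r/\mu$. The function names are 
illustrative only. 

```python
import math


def k_circular(x):
    """k = E/(m0 c^2) for a circular orbit at x = r/mu, from (9.42)."""
    return (1.0 - 2.0 / x) / math.sqrt(1.0 - 3.0 / x)


def dphi_dt_squared(x):
    """(dphi/dt)^2 in units of c^2/mu^2 for a circular orbit at x = r/mu."""
    dphi_dtau_sq = 1.0 / (x * x * (x - 3.0))   # (dphi/dtau)^2 = mu c^2 / [r^2 (r - 3 mu)]
    dtau_dt = (1.0 - 2.0 / x) / k_circular(x)  # from (1 - 2 mu/r) dt/dtau = k
    return dtau_dt ** 2 * dphi_dtau_sq


if __name__ == "__main__":
    for x in (4.0, 10.0, 100.0):
        print(x, dphi_dt_squared(x), 1.0 / x ** 3)
```

The two printed columns agree to rounding error for every $x > 3$, as the algebra 
above requires. 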


9.9 Stability of massive particle orbits 

The above analysis appears to suggest that the closest bound circular orbit around 
a massive spherical body is at $r = 4\mu$. However, we have not yet determined 
whether this orbit is stable. 

In Newtonian dynamics the equation of motion of a particle in a central potential 
can be written 

$$\frac{1}{2}\left(\frac{dr}{dt}\right)^2 + V_{\rm eff}(r) = E,$$

where $V_{\rm eff}(r)$ is the effective potential and $E$ is the total energy of the particle 
per unit mass. For an orbit around a spherical mass $M$, the effective potential is 

$$V_{\rm eff}(r) = -\frac{GM}{r} + \frac{h^2}{2r^2}, \tag{9.43}$$



where h is the specific angular momentum of the particle. This effective potential 
is shown in Figure 9.3. It can be seen that bound orbits have two turning points 
and that a circular orbit corresponds to the special case where the particle sits at 
the minimum of the effective potential. Furthermore one sees that, in Newtonian 
dynamics, a finite angular momentum provides an angular momentum barrier 
preventing a particle reaching r = 0. This is not true in general relativity. 

Figure 9.3 The Newtonian effective potential for $h \neq 0$, showing how an angular 
momentum barrier prevents particles reaching $r = 0$. 


In general relativity, the ‘energy’ equation (9.32) for the motion of a particle 
around a central mass can be written 

$$\frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + \frac{h^2}{2r^2}\left(1 - \frac{2\mu}{r}\right) - \frac{\mu c^2}{r} = \frac{c^2}{2}\left(k^2 - 1\right),$$

where we recall that the constant $k = E/(m_0 c^2)$. Thus in general relativity we 
identify the effective potential per unit mass as 

$$V_{\rm eff}(r) = -\frac{\mu c^2}{r} + \frac{h^2}{2r^2} - \frac{\mu h^2}{r^3}, \tag{9.44}$$

which has an additional term proportional to $1/r^3$ as compared with the Newtonian 
case (9.43). Remembering that $\mu = GM/c^2$, we see that (9.44) reduces to the 
form (9.43) in the non-relativistic limit $c \to \infty$. 

Figure 9.4 shows the general relativistic effective potential for several values of 
$\bar h = h/(c\mu)$. The dots indicate the locations of stable circular orbits, which occur 
at the local minimum of the potential. The local maxima in the potential curves 
are the locations of unstable circular orbits. For any given value of $h$, circular 
orbits occur where $dV_{\rm eff}/dr = 0$. Differentiating (9.44) gives 

$$\frac{dV_{\rm eff}}{dr} = \frac{\mu c^2}{r^2} - \frac{h^2}{r^3} + \frac{3\mu h^2}{r^4},$$

and so the extrema of the effective potential are located at the solutions of the 
quadratic equation 

$$\mu c^2 r^2 - h^2 r + 3\mu h^2 = 0,$$

which occur at 

$$r = \frac{h}{2\mu c^2}\left(h \pm \sqrt{h^2 - 12\mu^2 c^2}\right).$$

Figure 9.4 The general relativistic effective potential plotted for several values 
of the angular momentum parameter $\bar h$. 

We note, in particular, that if $h = \sqrt{12}\,\mu c = 2\sqrt{3}\,\mu c$ then there is only one 
extremum, and there are no turning points in the orbit for lower values of $h$. The 
significance of this result is that the innermost stable circular orbit has 

$$r_{\min} = 6\mu = \frac{6GM}{c^2}.$$

This orbit, with $r = 6\mu$ and $h/(\mu c) = 2\sqrt{3}$, is unique in satisfying both 
$dV_{\rm eff}/dr = 0$ and $d^2V_{\rm eff}/dr^2 = 0$, the latter being the condition for marginal 
stability of the orbit. 
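
The location and stability of the circular orbits can be explored with a short 
script. This is a sketch only, assuming units with $\mu = c = 1$ and taking 
$\bar h^2$ (rather than $\bar h$) as input so that the marginal case $\bar h^2 = 12$ 
is represented exactly in floating point. The function names are hypothetical. 

```python
import math


def circular_orbit_radii(hbar_sq):
    """Roots of x^2 - hbar^2 x + 3 hbar^2 = 0 with x = r/mu.

    Returns (inner, outer) radii, or None when hbar^2 < 12 and no
    circular orbit exists.
    """
    disc = hbar_sq * (hbar_sq - 12.0)
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return (0.5 * (hbar_sq - root), 0.5 * (hbar_sq + root))


def d2v_eff(x, hbar_sq):
    """Second derivative of V_eff = -1/x + hbar^2/(2 x^2) - hbar^2/x^3."""
    return -2.0 / x ** 3 + 3.0 * hbar_sq / x ** 4 - 12.0 * hbar_sq / x ** 5


if __name__ == "__main__":
    for hbar_sq in (9.0, 12.0, 16.0):
        radii = circular_orbit_radii(hbar_sq)
        if radii is None:
            print(f"hbar^2 = {hbar_sq}: no circular orbits")
            continue
        for x in radii:
            print(f"hbar^2 = {hbar_sq}: r = {x} mu, V'' = {d2v_eff(x, hbar_sq):+.3e}")
```

A positive second derivative marks a stable orbit (a minimum of the potential), a 
negative one an unstable orbit, and the two roots merge at $r = 6\mu$ where the 
second derivative vanishes. 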

The existence of an innermost stable orbit has some interesting astrophysical 
consequences. Gas in an accretion disc around a massive compact central body 
settles into circular orbits around the compact object. However, the gas slowly 
loses angular momentum because of turbulent viscosity (the turbulence is thought 
to be generated by magnetohydrodynamic instabilities). As the gas loses angular 
momentum it moves slowly inwards, losing gravitational potential energy and 
heating up. Eventually it has lost enough angular momentum that it can no longer 
follow a stable circular orbit, and so it spirals rapidly inwards onto the central 
object. 

We can make an estimate of the efficiency of energy radiation in an accretion 
disc. The maximum efficiency is of the order of the ‘gravitational binding energy’ 
at the innermost stable circular orbit (i.e. the energy $E$ lost as the particle moves 
from infinity to the innermost orbit) divided by the rest mass energy of the particle. 
Setting $r = 6\mu$ in (9.42) and remembering that $k = E/(m_0 c^2)$, we find that 

$$\frac{E}{m_0 c^2} = \frac{2\sqrt{2}}{3} \approx 0.943.$$

Thus the maximum radiation efficiency of the accretion disc is 

$$\epsilon_{\rm acc} \approx 1 - 0.943 = 5.7\%.$$

Thus, an accretion disc around a highly compact astrophysical object can convert 
perhaps a few percent of the rest mass energy of the gas into radiation; this may be 
compared with the efficiency of nuclear burning of hydrogen to helium (26 MeV 
per He nucleus), 

$$\epsilon_{\rm nuclear} \approx 0.7\%.$$

Accretion discs are therefore capable of converting rest mass energy into radiation 
with an efficiency that is about 10 times greater than the efficiency of the nuclear 
burning of hydrogen. The ‘accretion power’ of highly compact objects (such as 
black holes) causes some of the most energetic phenomena known in the universe. 
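
The two efficiencies can be reproduced with a few lines of arithmetic. In this 
sketch the proton rest energy of 938.27 MeV is an assumed input that does not 
appear in the text, and the nuclear figure is simply 26 MeV divided by the rest 
energy of four protons; the function names are illustrative. 

```python
import math


def accretion_efficiency(x_isco=6.0):
    """1 - k at the innermost stable circular orbit, with k from (9.42)."""
    k = (1.0 - 2.0 / x_isco) / math.sqrt(1.0 - 3.0 / x_isco)
    return 1.0 - k


def nuclear_efficiency(energy_mev=26.0, proton_rest_mev=938.27):
    """Energy released per He nucleus over the rest energy of four protons."""
    return energy_mev / (4.0 * proton_rest_mev)


if __name__ == "__main__":
    acc = accretion_efficiency()
    nuc = nuclear_efficiency()
    print(f"accretion: {100 * acc:.2f}%   nuclear: {100 * nuc:.2f}%   ratio: {acc / nuc:.1f}")
```

With these inputs the ratio comes out somewhat below 10, consistent with the 
order-of-magnitude comparison made above. 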

Figure 9.5 Orbit for a particle projected azimuthally from $r = 20\,GM/c^2$ with 
$h = 3.5\,GM/c$. A circular orbit would require $h = (20/\sqrt{17})\,GM/c$. The points 
are plotted at equal intervals of the particle’s proper time. 

A physically intuitive picture of a non-circular orbit and the capture of a particle 
with non-zero angular momentum $h$ may be obtained by differentiating the energy 
equation (9.32) for massive particle orbits with respect to proper time $\tau$. Using 
the original equation to remove the first derivative $dr/d\tau$, we find that 

$$\frac{d^2 r}{d\tau^2} = -\frac{GM}{r^2} + \frac{h^2}{r^3} - \frac{3h^2 GM}{c^2 r^4}.$$

As we might expect, the first two terms on the right-hand side are very like 
the Newtonian expressions corresponding to an inward gravitational force and a 
repulsive ‘centrifugal force’ proportional to $h^2$. The third term is new, however, 
and is also proportional to $h^2$ but this time acts inwards. This shows that close to a 
highly compact object, specifically within the radius $r = 3GM/c^2$, the centrifugal 
force ‘changes sign’ and is directed inwards, thus hastening the demise of any 
particle that strays too close to the object. This leads to spiral orbits of the type 
shown in Figure 9.5. 
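
Whether a particle projected azimuthally is captured can be decided without 
integrating the orbit, by comparing its energy with the height of the potential 
barrier in (9.44). The sketch below does this in units with $\mu = c = 1$ (an 
assumption), using hypothetical function names; it is one way of reading the 
effective-potential argument, not a reproduction of how Figure 9.5 was made. 

```python
import math


def v_eff(x, hbar):
    """Effective potential (9.44) per unit mass at x = r/mu, in units of c^2."""
    h2 = hbar * hbar
    return -1.0 / x + h2 / (2.0 * x * x) - h2 / x ** 3


def radial_acceleration(x, hbar):
    """d^2 r/d tau^2 = -dV_eff/dr, in units of c^2/mu."""
    h2 = hbar * hbar
    return -1.0 / x ** 2 + h2 / x ** 3 - 3.0 * h2 / x ** 4


def is_captured_from_rest(x0, hbar):
    """True if a particle started at x0 with dr/dtau = 0 falls into the centre."""
    h2 = hbar * hbar
    if h2 < 12.0:
        return True                      # no barrier: acceleration is inward everywhere
    root = math.sqrt(h2 * (h2 - 12.0))
    x_in, x_out = 0.5 * (h2 - root), 0.5 * (h2 + root)
    if x0 < x_in:
        return True                      # already inside the barrier
    if x0 <= x_out:
        return False                     # pushed outwards or sitting on a circular orbit
    return v_eff(x0, hbar) > v_eff(x_in, hbar)   # can it pass over the barrier?


if __name__ == "__main__":
    print(is_captured_from_rest(20.0, 3.5))   # parameters of Figure 9.5
    print(is_captured_from_rest(20.0, 4.0))
```

For the parameters of Figure 9.5 the particle's energy exceeds the barrier height at 
$r = 5.25\mu$, so the orbit spirals in; with $\bar h = 4$ the barrier is high enough 
to turn the particle back. 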


9.10 Trajectories of photons 

The trajectory of a photon (and of any other particle having zero rest mass) is a 
null geodesic. We cannot use the proper time $\tau$ as a parameter, so instead we use 
some affine parameter $\sigma$ along the geodesic. Considering motion in the equatorial 
plane, the equations of motion are given by the geodesic equations (9.21) and 
(9.23), and we replace the $r$-equation (9.22) by the condition $g_{\mu\nu}\dot x^\mu \dot x^\nu = 0$. Thus 
we have 

$$\left(1 - \frac{2\mu}{r}\right)\dot t = k, \tag{9.45}$$

$$c^2\left(1 - \frac{2\mu}{r}\right)\dot t^2 - \left(1 - \frac{2\mu}{r}\right)^{-1}\dot r^2 - r^2\dot\phi^2 = 0, \tag{9.46}$$

$$r^2\dot\phi = h. \tag{9.47}$$

For photon trajectories, an analogue of the energy equation (9.32) can again be 
obtained by substituting (9.45) and (9.47) into (9.46), which gives 

$$\dot r^2 + \frac{h^2}{r^2}\left(1 - \frac{2\mu}{r}\right) = c^2 k^2. \tag{9.48}$$

Similarly, the analogue for photons of the shape equation (9.33) is obtained by 
substituting $h = r^2\dot\phi$ into (9.46) and using the fact that 

$$\frac{dr}{d\sigma} = \frac{dr}{d\phi}\frac{d\phi}{d\sigma} = \frac{h}{r^2}\frac{dr}{d\phi}.$$

Making the usual substitution $u = 1/r$ and differentiating with respect to $\phi$ we 
find 

$$\frac{d^2 u}{d\phi^2} + u = \frac{3GM}{c^2}u^2. \tag{9.49}$$

It is again worth mentioning the two special cases of radial motion and motion in 
a circle. 

9.11 Radial motion of photons 

For radial motion $\dot\phi = 0$ and (9.46) reduces to 

$$c^2\left(1 - \frac{2\mu}{r}\right)\dot t^2 - \left(1 - \frac{2\mu}{r}\right)^{-1}\dot r^2 = 0,$$

from which we obtain 

$$\frac{dr}{dt} = \pm c\left(1 - \frac{2\mu}{r}\right). \tag{9.50}$$

On integrating, we have 

$$ct = r + 2\mu\ln\left|\frac{r}{2\mu} - 1\right| + \text{constant} \quad \text{(outgoing photon)},$$

$$ct = -r - 2\mu\ln\left|\frac{r}{2\mu} - 1\right| + \text{constant} \quad \text{(incoming photon)}.$$

Notice that under the transformation $t \to -t$, incoming and outgoing photon paths 
are interchanged, as we would expect. In fact for the moment the differential 
equation (9.50) is more useful. In a $(ct, r)$-diagram, we see that the photon 
worldlines will have slopes $\pm 1$ as $r \to \infty$ (forming the standard special-relativistic 
lightcone), but their slopes approach $\pm\infty$ as $r \to 2\mu$. This means that they become 
more vertical; the cone ‘closes up’. 
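
The integrated worldline and the closing-up of the lightcone can be checked 
numerically. This is a minimal sketch assuming units in which $\mu = 1$ by 
default; `ct_outgoing` and `lightcone_slope` are illustrative names, and the 
derivative is estimated by a central finite difference. 

```python
import math


def ct_outgoing(r, mu=1.0):
    """ct(r) for an outgoing radial photon, up to an additive constant."""
    return r + 2.0 * mu * math.log(abs(r / (2.0 * mu) - 1.0))


def lightcone_slope(r, mu=1.0):
    """d(ct)/dr for an outgoing photon, from (9.50); valid for r > 2 mu."""
    return 1.0 / (1.0 - 2.0 * mu / r)


def numerical_slope(r, mu=1.0, step=1e-6):
    """Central-difference estimate of d(ct)/dr along the integrated worldline."""
    return (ct_outgoing(r + step, mu) - ct_outgoing(r - step, mu)) / (2.0 * step)


if __name__ == "__main__":
    for r in (2.001, 2.1, 3.0, 5.0, 50.0, 5000.0):
        print(f"r = {r:8.3f} mu   slope = {lightcone_slope(r):10.3f}   "
              f"numerical = {numerical_slope(r):10.3f}")
```

The slope tends to 1 far from the mass and grows without bound as $r \to 2\mu$, 
which is the behaviour described above. 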

Our knowledge of the lightcone structure allows us to construct the ‘picture’ 
behind our earlier algebraic result that a particle takes infinite coordinate time to 
reach the horizon; this is illustrated in Figure 9.6. The curved solid line is the 
worldline of a massive particle dropped from rest by an observer fixed at $r = R$. 
Since massive particle worldlines are confined within the forward lightcone in any 
event, the closing up of the lightcones forces the worldlines of massive particles 
to become more vertical as $r \to 2\mu$. Thus, the particle ‘reaches’ $r = 2\mu$ only 
at $t = \infty$. Further, suppose that at some point along its trajectory the particle 
emits a radially outgoing photon in the direction of the observer. The tangent to 
the resulting photon worldline must, at any event, lie along the outward-pointing 
forward lightcone at that point. This is illustrated by the broken line in Figure 9.6. 
Thus, in the limit where the particle approaches $r = 2\mu$, the initial direction of 
the photon worldline approaches the vertical and so the photon will be received 
by the observer only at $t = \infty$. Thus to an external observer the particle appears 
to take an infinite time to reach the horizon. 

Figure 9.6 A radially infalling particle emitting a radially outgoing photon. The 
wavy line indicates the singularity at $r = 0$. 

As discussed earlier, however, the proper time $\tau$ experienced by a massive 
particle in falling to $r = 2\mu$ is finite. Moreover, $dr/d\tau$ does not tend to zero at this 
point, so the particle has not ‘run out of steam’ and presumably passes beyond 
this threshold. Thus, our present coordinate system is inadequate for discussing 
what happens at and within $r = 2\mu$, and our $(ct, r)$-diagram is in some respects 
misleading in these regions. We discuss this further in Chapter 11. 


9.12 Circular motion of photons 

For motion in a circle we have $r$ = constant. Thus, from the shape equation (9.49), 
we see that the only possible radius for a circular photon orbit is 

$$r = \frac{3GM}{c^2}.$$

Therefore a massive object can have a considerable effect on the path of a photon. 
There is no such orbit around the Sun, for example, since the solar radius is much 
larger than $3GM_\odot/c^2 \approx 4.5$ km, but outside a black hole there can be such an 
orbit. As we shall see below, however, the orbit is not stable. 
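
The solar figure quoted above is easy to reproduce. The constants in this 
sketch (values of $G$, $c$, the solar mass and the solar radius) are assumed 
round numbers that are not given in the text, and the function names are 
illustrative. 

```python
G = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8        # speed of light in m/s (assumed value)
M_SUN = 1.989e30   # solar mass in kg (assumed value)
R_SUN = 6.96e8     # solar radius in m (assumed value)


def photon_orbit_radius(mass):
    """Coordinate radius 3GM/c^2 of the circular photon orbit, in metres."""
    return 3.0 * G * mass / C ** 2


def has_photon_orbit(mass, body_radius):
    """The orbit exists in vacuum only if the body lies inside r = 3GM/c^2."""
    return body_radius < photon_orbit_radius(mass)


if __name__ == "__main__":
    print(f"3GM/c^2 for the Sun: {photon_orbit_radius(M_SUN) / 1e3:.2f} km")
    print("Sun has a circular photon orbit:", has_photon_orbit(M_SUN, R_SUN))
```

The solar radius exceeds $3GM_\odot/c^2$ by a factor of more than $10^5$, so no 
circular photon orbit exists outside the Sun. 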


9.13 Stability of photon orbits 

We can rewrite the ‘energy’ equation (9.48) for photon orbits as 

$$\frac{\dot r^2}{h^2} + V_{\rm eff}(r) = \frac{1}{b^2}, \tag{9.51}$$

where we have defined the quantity $b = h/(ck)$ and the effective potential 

$$V_{\rm eff}(r) = \frac{1}{r^2}\left(1 - \frac{2\mu}{r}\right).$$

In fact, by rescaling the affine parameter along the photon geodesic in such a way 
that $\sigma \to h\sigma$ the explicit $h$-dependence in (9.51) may be removed. 

The effective potential is plotted in Figure 9.7, from which we see that $V_{\rm eff}(r)$ 
has a single maximum at $r = 3\mu$, where the value of the potential is $1/(27\mu^2)$. 
Thus the circular orbit at $r = 3\mu$ is unstable. We conclude that there are no stable 
circular photon orbits in the Schwarzschild geometry. 
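
The position and height of the maximum can be confirmed by a simple grid 
search. This sketch assumes $\mu = 1$ and uses a hypothetical helper 
`photon_v_eff`; the grid spacing of $0.001\mu$ is an arbitrary choice. 

```python
def photon_v_eff(x):
    """Photon effective potential (1/r^2)(1 - 2 mu/r) at x = r/mu, units of 1/mu^2."""
    return (1.0 - 2.0 / x) / (x * x)


def locate_maximum(x_min=2.0, x_max=10.0, step=0.001):
    """Return the grid point in (x_min, x_max] at which photon_v_eff is largest."""
    n = int(round((x_max - x_min) / step))
    grid = [x_min + step * i for i in range(1, n + 1)]
    return max(grid, key=photon_v_eff)


if __name__ == "__main__":
    x_peak = locate_maximum()
    print(f"maximum near r = {x_peak:.3f} mu, V_eff = {photon_v_eff(x_peak):.6f}")
    print(f"1/27 = {1.0 / 27.0:.6f}")
```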

The character of general photon orbits is determined by the value of the 
constant $b$. To find the physical meaning of $b$, we begin by using the geodesic 
equation (9.47) and the energy equation (9.51) to write 

$$\frac{d\phi}{dr} = \frac{\dot\phi}{\dot r} = \pm\frac{1}{r^2}\left[\frac{1}{b^2} - \frac{1}{r^2}\left(1 - \frac{2\mu}{r}\right)\right]^{-1/2}.$$

Figure 9.7 The effective potential for photon orbits. 

Figure 9.8 The shape of a photon orbit passing a spherical mass if $b > 3\sqrt{3}\mu$. 

Thus, for a photon orbit, as $r \to \infty$ we have 

$$\frac{d\phi}{dr} = \pm\frac{b}{r^2}.$$

Assuming that $\phi \to 0$ as $r \to \infty$, the solution to this equation is 

$$r = \pm\frac{b}{\sin\phi},$$

which gives the equations of two straight lines with impact parameter $b$ passing 
on either side of the origin. 

The nature of the orbits depends very much on the value of the impact parameter $b$. 
Let us first consider inward-moving photons, i.e. photons for which $r$ is 
initially decreasing. From (9.51) and Figure 9.7 we see that if $1/b^2 < 1/(27\mu^2)$, 
so that $b > 3\sqrt{3}\mu$, then the orbit will have a single turning point of closest 
approach and escape again to infinity. This situation is illustrated in Figure 9.8. 
If $b < 3\sqrt{3}\mu$, however, then the light ray will be captured by the massive body 
and spiral in towards the origin. 

Similar considerations hold for trajectories that start at small radii, although the 
roles of the two cases are reversed because the photon must now pass outwards over 
the maximum of the potential. If $b < 3\sqrt{3}\mu$ 
then the photon will escape, and at infinity its straight-line path will have an impact 
parameter $b$. If $b > 3\sqrt{3}\mu$ then the photon path will have a turning point, and it 
will fall back towards the origin. In this case the photon does not reach infinity, 
so $b$ cannot be interpreted simply as an impact parameter. It is straightforward to 
show that if a photon is emitted from within the region $r = 2\mu$ to $r = 3\mu$ then the 
opening angle $\alpha$ from the radial direction for the photon to escape varies from 
$\alpha = 0$ at $r = 2\mu$ to $\alpha = \pi/2$ at $r = 3\mu$. 
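
The variation of the escape angle can be tabulated directly. The sketch below 
assumes the escape condition $\sin\alpha < 3\sqrt{3}\mu[U(r)]^{1/2}$ with 
$U(r) = (1/r^2)(1 - 2\mu/r)$, which is the result set as Exercise 9.24, and works 
in units with $\mu = 1$; the function name is illustrative. 

```python
import math


def escape_half_angle(x):
    """Largest angle from the outward radial direction (radians) at which a
    photon emitted at x = r/mu, with 2 <= x <= 3, still escapes to infinity."""
    if not 2.0 <= x <= 3.0:
        raise ValueError("this sketch only covers 2*mu <= r <= 3*mu")
    u = (1.0 - 2.0 / x) / (x * x)              # U(r) in units of 1/mu^2
    s = 3.0 * math.sqrt(3.0) * math.sqrt(u)    # critical value of sin(alpha)
    return math.asin(min(1.0, s))              # clamp guards against rounding above 1


if __name__ == "__main__":
    for x in (2.0, 2.2, 2.5, 2.8, 3.0):
        print(f"r = {x:.1f} mu   alpha = {math.degrees(escape_half_angle(x)):6.2f} deg")
```

The angle opens from zero at $r = 2\mu$ to a right angle at $r = 3\mu$, so that 
close to $r = 2\mu$ only photons emitted almost exactly radially outwards can 
escape. 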


Appendix 9A: General approach to gravitational redshifts 

Consider a general spacetime with metric $g_{\mu\nu}$ in some arbitrary coordinate system 
where $x^0$ is a timelike coordinate and the $x^i$ are spacelike. Suppose that an 
emitter $\mathcal{E}$ and a receiver $\mathcal{R}$ have worldlines $x^\mu_E(\tau_E)$ and $x^\mu_R(\tau_R)$ respectively, where 
$\tau_E$ and $\tau_R$ are the proper times of each observer. At some event $A$, $\mathcal{E}$ emits a 
photon with 4-momentum $p(A)$ that is received by $\mathcal{R}$ at an event $B$. Furthermore, 
let us assume that at event $A$ the emitter $\mathcal{E}$ has 4-velocity $u_E(A)$ and that at event 
$B$ the receiver has 4-velocity $u_R(B)$. This is illustrated schematically in Figure 9.9. 

Figure 9.9 Schematic illustration of the emission and reception of a photon. 

The energies of the photon as observed by the emitter at $A$ and by the receiver 
at $B$ are respectively given by 

$$E(A) = p(A)\cdot u_E(A) = p_\mu(A)\,u^\mu_E(A),$$

$$E(B) = p(B)\cdot u_R(B) = p_\mu(B)\,u^\mu_R(B).$$

Since in both cases $E = h\nu$, the ratio of the photon frequencies is given by the 
general result 

$$\frac{\nu_R}{\nu_E} = \frac{p_\mu(B)\,u^\mu_R(B)}{p_\mu(A)\,u^\mu_E(A)}. \tag{9.52}$$


If we know the components of the 4-momentum $p_\mu(A)$ at emission then we can 
calculate the components $p_\mu(B)$ at reception, using the fact that the photon travels 
along a null geodesic. Since the photon 4-momentum $p$ at any point is tangent 
to this geodesic, it is parallel-transported along the path. Thus, if the photon 
geodesic $x^\mu(\sigma)$ is described in terms of some affine parameter $\sigma$ then 

$$\frac{dp_\mu}{d\sigma} - \Gamma^\nu{}_{\mu\rho}\,p_\nu\frac{dx^\rho}{d\sigma} = 0.$$

Moreover, since $p$ is tangent to the geodesic, we can choose the affine parameter 
$\sigma$ so that $p^\mu = dx^\mu/d\sigma$, in which case 

$$\frac{dp_\mu}{d\sigma} = \Gamma^\nu{}_{\mu\rho}\,p_\nu p^\rho.$$

It is also worth remembering that a first integral of the equation for a photon 
geodesic, which can prove very useful, is 

$$p_\mu p^\mu = 0. \tag{9.53}$$


Let us now examine some special cases of the general formula (9.52). We 
begin by considering the case in which both the emitter $\mathcal{E}$ and the receiver $\mathcal{R}$ 
have fixed spatial coordinates. Thus, for $i = 1, 2, 3$ the spatial components of 
their 4-velocities are 

$$u^i_E = \frac{dx^i_E}{d\tau_E} = 0 \quad\text{and}\quad u^i_R = \frac{dx^i_R}{d\tau_R} = 0.$$

Moreover, in each case, the squared length of the 4-velocity is $g_{\mu\nu}u^\mu u^\nu = c^2$. In 
our situation, this reduces to $g_{00}(u^0)^2 = c^2$, so we find that 

$$u^0 = \frac{c}{(g_{00})^{1/2}}.$$

Hence the formula (9.52) reduces to 

$$\frac{\nu_R}{\nu_E} = \frac{p_0(B)}{p_0(A)}\left[\frac{g_{00}(A)}{g_{00}(B)}\right]^{1/2}. \tag{9.54}$$


Let us now make the additional assumption that the metric is stationary in our 
chosen coordinate system, i.e. 

$$\partial_0 g_{\mu\nu} = 0.$$

Thus, the metric components $g_{\mu\nu}$ cannot depend explicitly on the coordinate $x^0$. 
As shown in Section 3.19, this means that the zeroth covariant component of the 
tangent vector is constant along an affinely parameterised geodesic. Since the 
photon 4-momentum is simply proportional to the tangent vector, this means that 
$p_0$ is constant along the photon’s geodesic. Thus, in this case, (9.54) reduces 
further to 

$$\frac{\nu_R}{\nu_E} = \left[\frac{g_{00}(A)}{g_{00}(B)}\right]^{1/2}, \tag{9.55}$$

and we have recovered the result (9.14) derived earlier. 
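
As an illustration of (9.55), the sketch below evaluates the frequency ratio for 
a fixed emitter and receiver in the Schwarzschild geometry, taking 
$g_{00} = c^2(1 - 2\mu/r)$ from the line element. Units with $\mu = c = 1$ are 
assumed by default and the function names are hypothetical. 

```python
import math


def g00(r, mu=1.0, c=1.0):
    """Schwarzschild metric component g_00 = c^2 (1 - 2 mu/r), for r > 2 mu."""
    return c * c * (1.0 - 2.0 * mu / r)


def frequency_ratio(r_emit, r_recv, mu=1.0):
    """nu_R / nu_E for a stationary emitter and receiver, from (9.55)."""
    return math.sqrt(g00(r_emit, mu) / g00(r_recv, mu))


if __name__ == "__main__":
    print(frequency_ratio(4.0, float("inf")))   # emitted at r = 4 mu, received far away
    print(frequency_ratio(10.0, 4.0))           # received deeper in the potential
```

A ratio below unity corresponds to a redshift (the photon climbs out of the 
potential) and a ratio above unity to a blueshift. 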

Exercises 

9.1 Show that surfaces of constant $t$ and $r$ in the general isotropic metric (9.3) have 
surface area $4\pi r^2$. 

9.2 For the general static isotropic metric (9.4), show that the off-diagonal components of 
the Ricci tensor $R_{\mu\nu}$ are zero and that the diagonal components are given by (9.8-9.11). 

9.3 The Schwarzschild line element is 

$$ds^2 = c^2\left(1 - \frac{2\mu}{r}\right)dt^2 - \left(1 - \frac{2\mu}{r}\right)^{-1}dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2.$$

By considering the ‘Lagrangian’ $L = g_{\mu\nu}\dot x^\mu \dot x^\nu$, where the dots denote differentiation 
with respect to an affine parameter $\lambda$, calculate the connection coefficients $\Gamma^\mu{}_{\nu\sigma}$. 
Hence verify that the geodesic equations are given by (9.17-9.20). 

9.4 Derive the results (9.17-9.20) using the alternative form (9.28) of the geodesic 
equations. 

9.5 Calculate the connection coefficients and the Ricci tensor for the general isotropic 
metric (9.3). Hence prove Birkhoff’s theorem. 

9.6 Use Birkhoff’s theorem to show that a particle inside a spherical shell of matter 
experiences no gravitational force. 

9.7 Show that the ‘Lemaitre’ line element 

$$ds^2 = c^2\,dw^2 - \frac{4}{9}\left[\frac{9\mu}{2(z - cw)}\right]^{2/3}dz^2 - \left[\frac{9\mu}{2}(z - cw)^2\right]^{2/3}d\Omega^2,$$

where $d\Omega^2 = d\theta^2 + \sin^2\theta\,d\phi^2$, describes the Schwarzschild geometry. Show that 
observers with fixed spatial coordinates $(z, \theta, \phi)$ are in free fall and had zero velocity 
at infinity, and that the proper time of such observers is $w$. 

9.8 For a general stationary spacetime with line element 

$$ds^2 = g_{00}(\vec x)\,dt^2 + g_{ij}(\vec x)\,dx^i\,dx^j,$$

show that, for a fixed emitter and receiver, the ratio of the received photon frequency 
to the emitted frequency is 

$$\frac{\nu_R}{\nu_E} = \left[\frac{g_{00}(\vec x_E)}{g_{00}(\vec x_R)}\right]^{1/2},$$

where $\vec x_E$ and $\vec x_R$ are the fixed spatial coordinates of the emitter and receiver 
respectively. 







9.9 An isolated thin rigid spherical shell has mass $M$ and radius $R$. Suppose that a 
small hole is drilled through the shell, so that an observer $O$ at the shell’s centre 
can observe the outside universe. Show that a photon emitted by a fixed observer 
$E$ at $r = r_E$ (where $r_E > R$) and received by $O$ is blueshifted by the amount 

$$\frac{\nu_O}{\nu_E} = \left(\frac{1 - 2\mu/r_E}{1 - 2\mu/R}\right)^{1/2}.$$

9.10 Show that the quantity 



where p is the 4-momentum of a particle, is a constant of motion along any 
geodesic in the Schwarzschild geometry. Hence show that the particle orbits in a 
Schwarzschild geometry are stably planar. 

9.11 For a particle dropped from rest at infinity in the Schwarzschild geometry, find 
expressions for $t(r)$ and $\tau(r)$, where $t$ is the coordinate time and $\tau$ is the proper 
time of the particle. 

9.12 A particle is dropped from rest at a coordinate radius $r = R$ in the Schwarzschild 
geometry. Obtain an expression for the 4-velocity of the particle in $(t, r, \theta, \phi)$ 
coordinates when it passes coordinate radius $r$. 

9.13 A particle at infinity in the Schwarzschild geometry is moving radially inwards with 
coordinate speed $u_0$. Show that at any coordinate radius $r$ the coordinate velocity 
is given by 

$$\frac{dr}{dt} = -c\left(1 - \frac{2\mu}{r}\right)\left[1 - \frac{1}{\gamma_0^2}\left(1 - \frac{2\mu}{r}\right)\right]^{1/2},$$

where $\gamma_0 = (1 - u_0^2/c^2)^{-1/2}$. Determine the velocity relative to a stationary observer 
at $r$, and show that this velocity tends to $c$ as $r$ tends to $2GM/c^2$, irrespective of 
the value of $u_0$. 

9.14 Suppose that the particle in Exercise 9.13 has rest mass $m_0$ and that it stopped at 
$r = r_1$. If its excess energy was converted to radiation that is observed at infinity, 
show that the energy released as seen by a stationary observer at $r_1$ is 

$$E = m_0 c^2\left[\gamma_0\left(1 - \frac{2\mu}{r_1}\right)^{-1/2} - 1\right].$$

What is the energy released as observed at infinity? Show that this tends to $\gamma_0 m_0 c^2$ 
as $r_1$ tends to $2GM/c^2$. 

9.15 For a particle in a circular orbit of radius $r$ in the Schwarzschild geometry, use the 
alternative form (9.28) of the geodesic equations to show that 

$$\left(\frac{d\phi}{dt}\right)^2 = \frac{GM}{r^3}.$$


9.16 In the Schwarzschild geometry, a photon is emitted from a coordinate radius $r = r_2$ 
and travels radially inwards until it is reflected by a fixed mirror at $r = r_1$, so 
that it travels radially outwards back to $r = r_2$. How long does the round trip take 
according to a stationary observer at infinity? 

9.17 A photon moves in a circular orbit at $r = 3\mu$ in the Schwarzschild geometry. Show 
that the period of the orbit as measured by a stationary observer at this radius is 
$\tau = 6\pi\mu/c$. What is the period of the orbit as measured by a stationary observer 
at infinity? 

9.18 Show that a massive particle moving in the innermost stable circular orbit in the 
Schwarzschild geometry has speed $c/2$ as measured by a stationary observer at this 
radius. Hence calculate the period of the orbit as measured by the local observer. 
What is the period of the orbit as measured by a stationary observer at infinity? 

9.19 Alice is situated at a fixed position on the equator of the Earth (which is assumed to 
be spherical). In Schwarzschild coordinates $(t, r, \theta, \phi)$, her worldline is described 
in terms of a parameter $\tau$ by 

$$t = \gamma\tau, \quad r = R, \quad \theta = \pi/2, \quad \phi = \omega\tau,$$

where $\gamma$ and $\omega$ are constants and $R$ is the coordinate radius of the Earth’s surface. 
Bob is a distant stationary observer in space. Show that he will measure the orbital 
speed of Alice to be $v = R\omega/\gamma$. By considering the magnitude of Alice’s 4-velocity, 
show that 

$$\gamma = \left(1 - \frac{2GM}{c^2 R} - \frac{v^2}{c^2}\right)^{-1/2},$$

where $M$ is the mass of the Earth. Interpret this result physically. 

9.20 All massive objects look larger than they really are. Show that a light ray grazing 
the surface of a massive sphere of coordinate radius $r > 3GM/c^2$ will arrive at 
infinity with impact parameter 

$$b = r\left(\frac{r}{r - 2GM/c^2}\right)^{1/2}.$$

Hence show that the apparent diameter of the Sun ($M_\odot = 2\times10^{30}$ kg, $R_\odot = 
7\times10^{8}$ m) exceeds the coordinate diameter by nearly 3 km. 

9.21 The Hipparcos satellite can measure the positions of stars to an accuracy of 0.001 
arcseconds. If it is measuring the position of a star in a direction perpendicular 
to the plane of the Earth’s orbit, do Hipparcos observers need to account for the 
gravitational bending of light by the Sun? 

9.22 A massive particle is moving in the equatorial plane of the Schwarzschild geometry. 
Show that at infinity the particle moves in a straight line with impact parameter 
$b = h/(c\sqrt{k^2 - 1})$. 

9.23 An observer at rest at coordinate radius $r = R$ in the Schwarzschild geometry 
drops a massive particle which free-falls radially inwards. When the particle is at a 
coordinate radius $r = r_E$ it emits a photon radially outwards. Find an expression for 
the redshift $z$ of the photon when it is received by the observer. Show that $z \to \infty$ 
as $r_E \to 2GM/c^2$. 





9.24 Show that the geodesic equations for photon motion in the equatorial plane $\theta = \pi/2$ 
of the Schwarzschild geometry can be written in the form 



1 

b 1 


U(r), 


where $b$ is a constant, the dots correspond to differentiation with respect to some 
affine parameter and 

$$U(r) = \frac{1}{r^2}\left(1 - \frac{2\mu}{r}\right).$$



Suppose that a photon moving in the equatorial plane passes an observer at rest at 
a coordinate radius in the range $2\mu < r < 3\mu$. Show that the observer measures the 
radial and azimuthal components of the photon’s velocity to be 

$$v_r = \pm c\left[1 - b^2 U(r)\right]^{1/2} \quad\text{and}\quad v_\phi = cb\left[U(r)\right]^{1/2}.$$

If the observer emits a photon that makes an angle $\alpha$ with the outward radial 
direction, show that the photon will escape to infinity provided that 

$$\sin\alpha < 3\sqrt{3}\,\mu\left[U(r)\right]^{1/2}.$$

Find the values of $\alpha$ at $r = 2\mu$ and $r = 3\mu$. 

9.25 Alice and Bob are astronauts in a space capsule, with no engine, in a circular orbit 
at $r = R$ (where $R > 3\mu$) in the equatorial plane of the Schwarzschild geometry. At 
some point in the orbit, Bob leaves the capsule, uses his rocket-pack to maintain 
a hovering position at that fixed point in space and then rejoins the capsule after 
it has completed one orbit. Show that the proper time interval measured by Alice 
while Bob is out of the capsule is 

$$\Delta\tau_A = 2\pi\left[\frac{R^2(R - 3\mu)}{\mu c^2}\right]^{1/2}.$$

If $\Delta\tau_B$ is the corresponding proper time interval measured by Bob between these 
two events, show that 

$$\frac{\Delta\tau_B}{\Delta\tau_A} = \left(\frac{R - 2\mu}{R - 3\mu}\right)^{1/2}.$$

Briefly compare this result with the ‘twin paradox’ result in special relativity. If 
Bob chooses not to rejoin the capsule but instead observes it fly past him, show 
that he will measure the capsule’s speed as 

$$v = \left(\frac{\mu c^2}{R - 2\mu}\right)^{1/2}.$$


9.26 A particle $A$ and its antiparticle $B$ are travelling in opposite senses in free circular 
orbits in the equatorial plane of the Schwarzschild geometry, one at coordinate 
radius $r_A$ and the other at $r_B$ ($> r_A$). At some instant, $A$ emits a photon of frequency 
$\nu_A$ that travels radially and is received by $B$ with frequency $\nu_B$. Show that 

$$\frac{\nu_B}{\nu_A} = \left(\frac{1 - 3\mu/r_A}{1 - 3\mu/r_B}\right)^{1/2}.$$

Suppose now that $r_A = r_B = r$, so that the particles collide and annihilate each other. 
Show that the total radiated energy as measured by an observer at rest at the point 
of collision is given by 

$$E = 2 m_0 c^2\left(\frac{1 - 2\mu/r}{1 - 3\mu/r}\right)^{1/2},$$

where $m_0$ is the rest mass of each particle. 

9.27 If the cosmological constant $\Lambda$ is non-zero, show that the line element outside a 
static spherically symmetric matter distribution is given by 

$$ds^2 = c^2\left(1 - \frac{2\mu}{r} - \frac{\Lambda r^2}{3}\right)dt^2 - \left(1 - \frac{2\mu}{r} - \frac{\Lambda r^2}{3}\right)^{-1}dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$

Hence show that, in the weak-field Newtonian limit, a spherically symmetric mass 
$M$ produces a gravitational field strength $\vec g$ given by 

$$\vec g = -\left(\frac{GM}{r^2} - \frac{c^2\Lambda r}{3}\right)\hat{\vec r}.$$

Show that the shapes of massive particle orbits in the above geometry differ 
from those in the Schwarzschild geometry, but that the shapes of photon orbits 
do not. 

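9.28 Consider a static axisymmetric spacetime that is invariant under translations and 
reflections along the axis of symmetry. Show that, in general, the line element for 
such a spacetime can be written in the form 

$$ds^2 = A(\rho)\,dt^2 - d\rho^2 - B(\rho)\,d\phi^2 - C(\rho)\,dz^2,$$

for arbitrary functions $A$, $B$ and $C$. Show that the non-zero connection coefficients 
for this line element are given by 

$$\Gamma^0{}_{01} = \frac{A'}{2A}, \quad \Gamma^1{}_{00} = \frac{A'}{2}, \quad \Gamma^1{}_{22} = -\frac{B'}{2}, \quad \Gamma^1{}_{33} = -\frac{C'}{2}, \quad \Gamma^2{}_{12} = \frac{B'}{2B}, \quad \Gamma^3{}_{13} = \frac{C'}{2C},$$

where the primes denote $d/d\rho$. Hence show that the non-zero components of the 
Ricci tensor are given by 

$$R_{00} = -\frac{A''}{2} + \frac{A'}{2}\left(\frac{A'}{2A} - \frac{B'}{2B} - \frac{C'}{2C}\right),$$

$$R_{11} = \frac{A''}{2A} - \frac{(A')^2}{4A^2} + \frac{B''}{2B} - \frac{(B')^2}{4B^2} + \frac{C''}{2C} - \frac{(C')^2}{4C^2},$$

$$R_{22} = \frac{B''}{2} - \frac{B'}{2}\left(\frac{B'}{2B} - \frac{A'}{2A} - \frac{C'}{2C}\right),$$

$$R_{33} = \frac{C''}{2} - \frac{C'}{2}\left(\frac{C'}{2C} - \frac{A'}{2A} - \frac{B'}{2B}\right).$$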

9.29 Consider a static, infinitely long, cylindrically symmetric matter distribution of 
constant radius that is invariant to Lorentz boosts along the symmetry axis (a 
‘cosmic string’). Show that the line element outside the body can be written as 

$$ds^2 = c^2\,dt^2 - d\rho^2 - (a + \beta\rho)^2\,d\phi^2 - dz^2,$$

where $a$ and $\beta$ are constants. For the case $a = 0$, consider the spacelike surfaces 
defined by $t$ = constant and $z$ = constant and calculate the circumference of a circle 
of constant coordinate radius $\rho$ in such a surface. Hence show that, for $\beta < 1$, the 
geometry on the spacelike surface is that of a two-dimensional cone embedded in 
three-dimensional Euclidean space. 


Experimental tests of general relativity 


Most of the experimental tests of general relativity are based on the Schwarzschild 
geometry in the region $r > 2GM/c^2$. Some are based on the trajectories of massive 
particles and others on photon trajectories. Most of the ‘classic’ tests are in the 
weak-field limit, but more recent observations have begun to probe the more 
discriminative strong-field regime. We will now discuss both these ‘classic’ 
experimental tests and some of the more recent findings and proposals. Some later 
tests are in fact more closely linked to the Kerr geometry (see Chapter 13), which 
describes spacetime outside a rotating massive body, but the basic principles can 
still be understood in terms of the simpler Schwarzschild geometry. 


10.1 Precession of planetary orbits 

For a general non-circular orbit in Newtonian theory the equation of motion is 

$$\frac{d^2 u}{d\phi^2} + u = \frac{GM}{h^2},$$

where $u = 1/r$ and $h$ is the angular momentum per unit mass of the orbiting 
particle. For a bound orbit, the equation has the solution 

$$u = \frac{GM}{h^2}(1 + e\cos\phi), \tag{10.1}$$

which describes an ellipse; the parameter $e$ measures the ellipticity of the orbit. 
Thus, for example, we can draw the orbit of a planet around the Sun as in 
Figure 10.1. We can write the distance of closest approach (perihelion) as $r_1 = 
a(1 - e)$ and the distance of furthest approach (aphelion) as $r_2 = a(1 + e)$. The 
equation of motion then requires that the semi-major axis is given by 

$$a = \frac{h^2}{GM(1 - e^2)}. \tag{10.2}$$

Figure 10.1 The elliptical orbit of a planet around the Sun; $e$ is the ellipticity 
of the orbit. 


The general-relativistic equation of motion is 

$$\frac{d^2 u}{d\phi^2} + u = \frac{GM}{h^2} + \frac{3GM}{c^2}u^2. \tag{10.3}$$

If the gravitational field is weak, as it is for planetary orbits around the Sun, then 
we expect Newtonian gravity to provide an excellent approximation to the motion 
in general relativity. We can therefore treat the Newtonian solution (10.1) as the 
zeroth-order solution to the general-relativistic equation of motion. Thus let us 
write the general-relativistic solution as 

$$u = \frac{GM}{h^2}(1 + e\cos\phi) + \Delta u,$$

where $\Delta u$ is a perturbation. Substituting this expression into the general-relativistic 
equation (10.3) we find that, to first order in $\Delta u$, 

$$\frac{d^2\Delta u}{d\phi^2} + \Delta u = A\left(1 + e^2\cos^2\phi + 2e\cos\phi\right),$$

where the constant $A = 3(GM)^3/(c^2 h^4)$ is very small. A particular integral of 
this equation is easily found to be 

$$\Delta u = A\left[1 + e^2\left(\tfrac{1}{2} - \tfrac{1}{6}\cos 2\phi\right) + e\phi\sin\phi\right], \tag{10.4}$$

which can be checked by direct differentiation. 
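
The check by differentiation can also be carried out numerically. The sketch 
below substitutes (10.4) into the perturbation equation using a central 
finite-difference estimate of the second derivative; the step size and the 
function names are arbitrary choices made here, and the constant $A$ is set to 1 
since it scales out. 

```python
import math


def delta_u(phi, e, amp=1.0):
    """Particular integral (10.4), with amp standing for the constant A."""
    return amp * (1.0 + e * e * (0.5 - math.cos(2.0 * phi) / 6.0)
                  + e * phi * math.sin(phi))


def source_term(phi, e, amp=1.0):
    """Right-hand side A (1 + e^2 cos^2(phi) + 2 e cos(phi))."""
    return amp * (1.0 + e * e * math.cos(phi) ** 2 + 2.0 * e * math.cos(phi))


def residual(phi, e, step=1e-3):
    """d^2(delta_u)/dphi^2 + delta_u - source, using central differences."""
    second = (delta_u(phi + step, e) - 2.0 * delta_u(phi, e)
              + delta_u(phi - step, e)) / step ** 2
    return second + delta_u(phi, e) - source_term(phi, e)


if __name__ == "__main__":
    for phi in (0.3, 1.0, 2.5, 6.0):
        print(f"phi = {phi:4.1f}   residual = {residual(phi, 0.2):+.2e}")
```

The residuals are at the level of the finite-difference error, which is 
consistent with (10.4) being a particular integral. 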

Since the constant A is very small, the first two terms on the right-hand side 
of (10.4) are tiny, and of no use in testing the theory. However, the last term, 
Ae<f sin 0, might be tiny at first but will gradually grow with time, since the factor 
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(f) means that it is cumulative. We must therefore retain it, and so our approximate 
solution reads 

u = [1 + e(cos4> +acf) sin cj))] , (10.5) 

h- 

where a = 3 (GM) 2 /(h 2 c 2 ) <<C 1. Using the relation 

cos [<(>(1 — a )] = cos cos af) + sin cj) sin af> 

~ cos <p + af) sin 0 for a^l, (10.6) 


we can therefore write 


u 


GM 


{1 + ecos[<(>(l — a)]}. 


(10.7) 


From this expression, we see that the orbit is periodic, but with a period
$2\pi/(1-\alpha)$, i.e. the $r$-values repeat on a cycle that is larger than $2\pi$. The result
is that the orbit cannot ‘close’, and so the ellipse precesses (see Figure 10.2). In
one revolution, the ellipse will rotate about the focus by an amount

$$\Delta\phi = \frac{2\pi}{1-\alpha} - 2\pi \approx 2\pi\alpha = \frac{6\pi(GM)^2}{h^2c^2}.$$

Substituting for $h$ from (10.2), we finally obtain

$$\Delta\phi = \frac{6\pi GM}{a(1-e^2)c^2}. \tag{10.8}$$



Figure 10.2 Precession of an elliptical orbit (greatly exaggerated).

Let us apply equation (10.8) to the orbit of Mercury, which has the following
parameters: period = 88 days, $a = 5.8 \times 10^{10}$ m, $e = 0.2$. Using $M_\odot = 2 \times 10^{30}$ kg,
we find

$$\Delta\phi = 43'' \text{ per century}.$$

In fact, the measured precession is

$$5599.7'' \pm 0.4'' \text{ per century},$$

but almost all of this is caused by perturbations from other planets. The residual, 
after taking perturbations into account, is in remarkable agreement with general 
relativity. The residuals for a number of planets (and Icarus, which is a large 
asteroid with a perihelion that lies within the orbit of Mercury) may also be 
calculated (in arcseconds per century): 



            Observed residual    Predicted residual
Mercury     43.1 ± 0.5           43.03
Venus       8 ± 5                8.6
Earth       5 ± 1                3.8
Icarus      10 ± 1               10.3


In each case, the results are in excellent agreement with the predictions of general
relativity. Einstein included this calculation regarding Mercury in his 1915 paper 
on general relativity. He had solved one of the major problems of celestial 
mechanics in the very first application of his complicated theory to an empirically 
testable problem. As you can imagine, this gave him tremendous confidence in 
his new theory. 
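The Mercury estimate from equation (10.8) is easy to reproduce numerically. The sketch below uses the rounded orbital parameters quoted above; the values of $G$ and $c$, the choice of a Julian century of 36525 days and the function names are assumptions made for illustration.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi
DAYS_PER_CENTURY = 36525.0  # Julian century (assumed convention)

def shift_per_orbit(mass_kg, a_m, e):
    """Perihelion advance per revolution from (10.8), in radians."""
    return 6.0 * math.pi * G * mass_kg / (a_m * (1.0 - e**2) * C**2)

def shift_per_century_arcsec(mass_kg, a_m, e, period_days):
    """Advance accumulated over one century, in arcseconds."""
    orbits = DAYS_PER_CENTURY / period_days
    return shift_per_orbit(mass_kg, a_m, e) * orbits * ARCSEC_PER_RADIAN

if __name__ == "__main__":
    mercury = shift_per_century_arcsec(mass_kg=2e30, a_m=5.8e10, e=0.2, period_days=88.0)
    print(f"Mercury: {mercury:.1f} arcsec per century")
```

With these rounded inputs the result is close to 43 arcseconds per century; the per-orbit shift itself is only about $5 \times 10^{-7}$ radians, which is why the cumulative nature of the effect matters.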


10.2 The bending of light 


We have already noted that a massive object can have a significant effect on 
the propagation of photons. For example, photons can travel in a circular orbit 
at $r = 3GM/c^2$. We do not, however, expect to observe this effect directly, but
a more modest bending of light can be observed. For investigating the slight 
deflection of light by, for example, the Sun, it is easiest to follow an approximation 
technique analogous to that used in predicting the perihelion shift of Mercury. 

As we showed in Chapter 9, the ‘shape’ equation for a photon trajectory in the 
equatorial plane of the Schwarzschild geometry is 


$$\frac{d^2u}{d\phi^2} + u = \frac{3GM}{c^2}u^2, \tag{10.9}$$

Figure 10.3 Angles and coordinates in the deflection of light by a spherical mass.

where $u = 1/r$. In the absence of matter, the right-hand side vanishes and we may
write the solution as

$$u = \frac{\sin\phi}{b}, \tag{10.10}$$

which represents a straight-line path with impact parameter $b$ (see Figure 10.3).
We again treat (10.10) as the zeroth-order solution to the equation of motion.
Thus we write the general-relativistic solution as

$$u = \frac{\sin\phi}{b} + \Delta u,$$


where $\Delta u$ is a perturbation. Substituting this expression into (10.9), we find that,
to first order in $\Delta u$,

$$\frac{d^2\Delta u}{d\phi^2} + \Delta u = \frac{3GM}{c^2b^2}\sin^2\phi.$$

This is satisfied by the particular integral

$$\Delta u = \frac{3GM}{2c^2b^2}\left(1 + \tfrac{1}{3}\cos 2\phi\right), \tag{10.11}$$

and adding (10.11) to the zeroth-order solution (10.10) yields

$$u = \frac{\sin\phi}{b} + \frac{3GM}{2c^2b^2}\left(1 + \tfrac{1}{3}\cos 2\phi\right). \tag{10.12}$$


Now consider the limit $r \to \infty$, i.e. $u \to 0$. Clearly, for a slight deflection we
can take $\sin\phi \approx \phi$ and $\cos 2\phi \approx 1$ at infinity, to obtain $\phi = -2GM/(c^2b)$. Thus
the total deflection (see Figure 10.3) is

$$\Delta\phi = \frac{4GM}{c^2b}. \tag{10.13}$$


This is the famous gravitational deflection formula (which incidentally is twice
what had previously been worked out using a Newtonian approach). For light
grazing the Sun it yields $\Delta\phi = 1.75''$. The 1919 eclipse expedition led by Eddington
gave two sets of results:

$$\Delta\phi = 1.98'' \pm 0.16'',$$

$$\Delta\phi = 1.61'' \pm 0.4'',$$

both consistent with the theory. This provided the first experimental verification
of a prediction of Einstein’s theory (the ‘anomalous’ perihelion shift of Mercury
had been known for many years) and turned Einstein into a scientific superstar.¹
Some historians have argued that Eddington ‘fiddled’ the results to agree with the
theory. If Eddington did indeed massage the results, then he gambled correctly.
Later high-precision tests using radio sources, which can be observed near the
Sun even when there is no solar eclipse, leave no doubt that the
general-relativistic prediction is accurate to a fraction of a percent. Modern radio
experiments using very long baseline interferometry (VLBI) have been performed
to measure the gravitational deflection of the positions of radio quasars as they are
eclipsed by the Sun. Such experiments can be performed to an accuracy of better
than ~$10^{-4}$ arcseconds. Figure 10.4 summarizes the results of measurements of
the deflection angle $\Delta\phi$ from experiments conducted over the period 1969-75.
The results are in excellent agreement with the predictions of general relativity.
Moreover, as one can see from the figure, the results constrain the parameter $\omega$
in the Brans-Dicke theory of gravity (see Appendix 8A): we have $\omega > 40$.
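The value quoted above for light grazing the Sun follows directly from (10.13) with the impact parameter set equal to the solar radius. The sketch below is illustrative only: the solar mass and radius, the constants and the function names are assumed values not taken from the text, and the Newtonian figure simply applies the factor of one-half mentioned above.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_SUN = 1.989e30  # solar mass, kg (assumed value)
R_SUN = 6.96e8    # solar radius, m (assumed value)
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def deflection_gr(mass_kg, b_m):
    """Total deflection angle (10.13) in radians for impact parameter b."""
    return 4.0 * G * mass_kg / (C**2 * b_m)

def deflection_newtonian(mass_kg, b_m):
    """Half the general-relativistic value, as stated in the text."""
    return 0.5 * deflection_gr(mass_kg, b_m)

if __name__ == "__main__":
    gr = deflection_gr(M_SUN, R_SUN) * ARCSEC_PER_RADIAN
    newton = deflection_newtonian(M_SUN, R_SUN) * ARCSEC_PER_RADIAN
    print(f"General relativity: {gr:.2f} arcsec, Newtonian: {newton:.2f} arcsec")
```

Because the deflection scales as $1/b$, rays passing at several solar radii are deflected by correspondingly smaller angles, which is why high astrometric precision is needed away from the limb.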

For more dramatic light deflection, our adopted approach of successive
approximations is unsuitable. In this case, it is more appropriate to use the exact equation
for $d\phi/dr$ derived in Chapter 9, which reads

$$\frac{d\phi}{dr} = \frac{1}{r^2}\left[\frac{1}{b^2} - \frac{1}{r^2}\left(1 - \frac{2\mu}{r}\right)\right]^{-1/2},$$

where $b$ is the impact parameter at infinity. We also showed in Chapter 9 that if
$b > 3\sqrt{3}\mu$ then the photon is not captured by the mass; the resulting general orbit
shape is illustrated in Figure 10.5. From the figure, we see that the deflection
angle is given by

$$\Delta\phi = 2\int_{r_0}^{\infty}\frac{1}{r^2}\left[\frac{1}{b^2} - \frac{1}{r^2}\left(1 - \frac{2\mu}{r}\right)\right]^{-1/2}dr - \pi, \tag{10.14}$$

where $r_0$ is the point of closest approach, at which the expression in the square
brackets in (10.14) vanishes.




¹ The media had a great story. Remember that this was just after the end of the First World War, and so
the headlines read something like ‘Newton’s theory of gravity overthrown by German physicist, verified by
British scientists.’

Figure 10.4 Results of radio-wave deflection measurements of the positions
of quasars in the period 1969-75 (from C. Will, Theory and Experiment in
Gravitational Physics, Cambridge University Press, 1981). The deflection angle
is $\Delta\phi = \alpha\,4GM/(R_\odot c^2)$ and the error bars are plotted on the parameter $\alpha$.
If general relativity is correct, we expect $\alpha = 1$. The abscissa scale gives the
measured values of the parameter $\omega$ in the Brans-Dicke scalar-tensor theory of
gravity, discussed in Appendix 8A.
[Plot not reproduced. Horizontal axis: $\alpha$, from 0.88 to 1.08, with a second scale giving the value of the scalar-tensor $\omega$; vertical axis: year, 1969 to 1975. Radio deflection experiments shown: Muhleman et al. (1970); Hill (1971); Stramek (1974); Riley (1973); Weiler et al. (1974); Counselman et al. (1974); Weiler et al. (1974); Fomalont and Stramek (1975); Fomalont and Stramek (1976).]

Figure 10.5 Angles and coordinates in the deflection of light by a spherical mass.


10.3 Radar echoes 

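In Chapter 9, we showed that the ‘energy’ equation for a photon orbit in the
Schwarzschild geometry is

$$\left(\frac{dr}{d\sigma}\right)^2 + \frac{h^2}{r^2}\left(1 - \frac{2\mu}{r}\right) = c^2k^2,$$

where $\sigma$ is an affine parameter along the photon path. Using the result

$$\left(\frac{dr}{d\sigma}\right)^2 = \left(\frac{dr}{dt}\frac{dt}{d\sigma}\right)^2 = \frac{k^2}{(1-2\mu/r)^2}\left(\frac{dr}{dt}\right)^2,$$

we can rewrite the energy equation as

$$\frac{1}{(1-2\mu/r)^3}\left(\frac{dr}{dt}\right)^2 + \frac{h^2}{k^2r^2} = \frac{c^2}{1-2\mu/r}. \tag{10.15}$$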


Now consider a photon path from Earth to another planet (say Venus), as shown
in Figure 10.6. Evidently the photon path will be deflected by the gravitational
field of the Sun (assuming that the planets are in a configuration like that shown
in the figure, where the photon has to pass close to the Sun in order to reach
Venus). Let $r_0$ be the coordinate distance of closest approach of the photon to the
Sun; then

$$\left(\frac{dr}{dt}\right)_{r_0} = 0,$$

and so from (10.15) we have

$$\frac{h^2}{k^2r_0^2} = \frac{c^2}{1-2\mu/r_0}.$$

Thus, after rearrangement, we can write (10.15) as

$$\frac{dr}{dt} = c\left(1 - \frac{2\mu}{r}\right)\left[1 - \frac{r_0^2(1-2\mu/r)}{r^2(1-2\mu/r_0)}\right]^{1/2},$$

which can be integrated to give for the time taken to travel between points $r_0$
and $r$

$$t(r, r_0) = \int_{r_0}^{r}\frac{1}{c(1-2\mu/r)}\left[1 - \frac{r_0^2(1-2\mu/r)}{r^2(1-2\mu/r_0)}\right]^{-1/2}dr.$$

Figure 10.6 Photon path from Earth to Venus deflected by the Sun.

The integrand can be expanded to first order in $\mu/r$ to obtain

$$t(r, r_0) = \int_{r_0}^{r}\frac{r}{c(r^2-r_0^2)^{1/2}}\left[1 + \frac{2\mu}{r} + \frac{\mu r_0}{r(r+r_0)}\right]dr,$$

which can be evaluated to give

$$t(r, r_0) = \frac{(r^2-r_0^2)^{1/2}}{c} + \frac{2\mu}{c}\ln\left[\frac{r + (r^2-r_0^2)^{1/2}}{r_0}\right] + \frac{\mu}{c}\left(\frac{r-r_0}{r+r_0}\right)^{1/2}. \tag{10.16}$$


The first term on the right-hand side is just what we would have expected if
the light had been travelling in a straight line. The second and third terms give us
the extra coordinate time taken for the photon to travel along the curved path to
the point $r$. So, you can see from Figure 10.6 that if we bounce a radar beam to
Venus and back then the excess coordinate-time delay over a straight-line path is

$$\Delta t = 2\left[t(r_E, r_0) + t(r_V, r_0) - \frac{(r_E^2-r_0^2)^{1/2}}{c} - \frac{(r_V^2-r_0^2)^{1/2}}{c}\right],$$

where the factor 2 is included because the photon has to go to Venus and back.
Since $r_E \gg r_0$ and $r_V \gg r_0$ we have

$$t(r_E, r_0) - \frac{(r_E^2-r_0^2)^{1/2}}{c} \approx \frac{2\mu}{c}\ln\left(\frac{2r_E}{r_0}\right) + \frac{\mu}{c},$$

and likewise for $t(r_V, r_0)$. Thus, the excess coordinate-time delay is

$$\Delta t \approx \frac{4GM}{c^3}\left[\ln\left(\frac{4r_Er_V}{r_0^2}\right) + 1\right].$$

Of course, clocks on Earth do not measure coordinate time but the corresponding
proper time. Assuming the Earth to be at fixed coordinates $(r, \theta, \phi)$ during the
travel time of the signal, this is given by

$$\Delta\tau = \left(1 - \frac{2GM}{c^2r_E}\right)^{1/2}\Delta t.$$

However, since $r_E \gg GM/c^2$, we can ignore this effect to the accuracy of our
calculation. For Venus, when it is opposite to the Earth on the far side of the Sun,

$$\Delta t \approx 220\ \mu\text{s}.$$

The idea of the experiment is as follows. Fire an intense radar beam towards
Venus when it is almost opposite to the Earth on the far side of the Sun and
measure the time delay of the radar echo with a sensitive radio telescope. The
excess time delay gives us a test of the principle of equivalence. This sounds
straightforward, but the time delay is very small and depends on the values of
$r_E$, $r_V$, $r_0$. How can one determine these parameters to the required precision?
The answer is to fit the measured delays over a long period of time to a curve
chosen by varying $r_E$, $r_V$, $\mu$ etc. as free parameters (see Figure 10.7). There are a
number of technical problems that limit the accuracy of this method. Firstly, we
must correct for the motion of Venus and the Earth in their orbits and for their
individual gravitational fields. Also, in practice, the radar beam is reflected from
different points on the surface of Venus (mountain peaks, valleys, etc.) and this
introduces a dispersion in the time delay of several hundred μs. This problem
can be solved by bouncing the radar beams from a mirror - as has since been
done using the Viking landers on Mars. Another, more complicated, problem is
correcting for refraction by the Solar corona - this can be important for photon
paths that graze the surface of the Sun. Nevertheless, Figure 10.7 confirms that the
corrected measurements are in excellent agreement with the general-relativistic
prediction.
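To get a feel for the size of the effect, the approximate expression for the excess coordinate-time delay, $\Delta t \approx (4GM/c^3)[\ln(4r_Er_V/r_0^2) + 1]$, can be evaluated directly. The sketch below assumes circular orbits with rounded mean radii for Earth and Venus and takes $r_0$ equal to the solar radius (a ray grazing the limb); all numerical inputs and names are illustrative assumptions, and the result depends logarithmically on the $r_0$ adopted.

```python
import math

G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8              # speed of light, m/s
M_SUN = 1.989e30         # solar mass, kg (assumed value)
R_SUN = 6.96e8           # solar radius, m, used here as r_0 (assumption)
R_EARTH_ORBIT = 1.496e11 # mean Earth-Sun distance, m (assumed value)
R_VENUS_ORBIT = 1.082e11 # mean Venus-Sun distance, m (assumed value)

def excess_delay(mass_kg, r_e, r_v, r0):
    """Round-trip excess coordinate-time delay in seconds, valid for r_e, r_v >> r0."""
    prefactor = 4.0 * G * mass_kg / C**3
    return prefactor * (math.log(4.0 * r_e * r_v / r0**2) + 1.0)

if __name__ == "__main__":
    for grazing in (1.0, 2.0, 5.0):
        dt = excess_delay(M_SUN, R_EARTH_ORBIT, R_VENUS_ORBIT, grazing * R_SUN)
        print(f"r_0 = {grazing:.0f} solar radii: delay = {dt * 1e6:.0f} microseconds")
```

For these inputs the delay comes out at a few hundred microseconds, the same order as the figure quoted above, and it falls only slowly as the closest-approach distance increases.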



Figure 10.7 The Earth-Venus time-delay measurement compared with the
general-relativistic prediction.

10.4 Accretion discs around compact objects

As we have seen, the orbits of particles and photons are probes of the geometry of
spacetime. Information about the geometry produced by compact massive objects
or black holes can be obtained from observations of the orbits of particles in the
accretion disc that often surrounds them. As we saw in Chapter 9, the radiation
efficiency of the accretion disc around a Schwarzschild black hole is 10 times
greater than the efficiency of the nuclear burning of hydrogen, and such discs are
very strong emitters in X-rays.

Even at the temperatures of ~$10^7$ K that characterise an accretion disc, some
heavy nuclei retain bound electrons. The small trace of iron found in the accreting 
matter is such a nucleus. Incident radiation from X-ray flares above and below 
the disc can lead to fluorescence from the highly ionised atoms in the disc; in 
this process an electron in the atom is de-excited from a higher energy level 
to a lower one and emits a photon. For iron atoms, this results in photons of 
energy 6.4 keV, giving a spectral line roughly in the middle of the X-ray band. 
As one might expect, however, the frequency of the emitted photons as measured 
by some observer at infinity (i.e. an astronomer on Earth) will differ from the 
frequency with which the photons were emitted. Qualitatively, there are two
effects that cause this frequency shift. First, the photons will be gravitationally 
redshifted by an amount that depends on the radius from which they were emitted. 
Second, they will be Doppler shifted by an amount that depends on the speed and 
direction (relative to the distant observer) of the material from which they were 
emitted, in particular whether the material was moving towards or away from the 
observer. 

Unfortunately, given the typical size of accretion discs around compact objects,
and their large distance from us, the angular size of such systems as viewed 
from Earth is typically far smaller than the width of the observing beam of any 
telescope. Thus when an astronomer measures the spectrum (i.e. the photon flux as 
a function of frequency) of such an object, the radiation received at each frequency 
comes from various parts of the disc. Nevertheless, the observed spectrum is seen 
to consist of a much-broadened iron line, whose shape contains information about 
the spacetime geometry around the accreting object. In spite of the integration of 
contributions from across the disc, the photons coming from the inner parts of 
the accretion disc close to the compact object allow one to use the line profile to 
probe the strong-field regime of gravity. 

As an illustration, let us calculate in some simple cases the redshift one would 
expect if the central object were not rotating, so that the geometry outside it is 
given by the Schwarzschild metric. For simplicity, take the disc to be oriented 
edge-on to the observer, as shown in Figure 10.8. All orbits are then in the plane
of the observer and the disc, which we take to be the equatorial plane $\theta = \pi/2$.

Figure 10.8 The emission of a photon by matter in an accretion disc around a
compact object. The observer is viewing the disc edge-on.


The ratio of the photon’s frequency at reception to that at emission is given by

$$\frac{\nu_R}{\nu_E} = \frac{\boldsymbol{p}(R)\cdot\boldsymbol{u}_R}{\boldsymbol{p}(E)\cdot\boldsymbol{u}_E} = \frac{p_\mu(R)\,u_R^\mu}{p_\mu(E)\,u_E^\mu}, \tag{10.17}$$

where $\boldsymbol{p}(E)$ and $\boldsymbol{p}(R)$ are the photon 4-momenta at emission and reception
respectively, $\boldsymbol{u}_E$ is the 4-velocity of the material at emission and $\boldsymbol{u}_R$ is the 4-velocity
of the observer at reception. Assuming the observer to be fixed at infinity, the
components of his 4-velocity in the $(t, r, \theta, \phi)$ coordinate system are

$$[u_R^\mu] = (1, 0, 0, 0).$$

Now consider the 4-velocity of the emitting material. Since we are assuming that
this material is moving in a circular orbit it must have a 4-velocity of the form

$$[u_E^\mu] = (u_E^0, 0, 0, u_E^3).$$

Using the fact that

$$u_E^3 = \frac{d\phi}{d\tau} = \frac{d\phi}{dt}\frac{dt}{d\tau} = \frac{d\phi}{dt}u_E^0,$$

we can write the emitter’s 4-velocity as

$$[u_E^\mu] = u_E^0(1, 0, 0, \Omega),$$

where, for circular motion, $\Omega = d\phi/dt = (GM/r^3)^{1/2}$, which we derived in
Chapter 9. We can now fix $u_E^0$ by using the fact that $g_{\mu\nu}u^\mu u^\nu = c^2$. If the emitting
material is at a coordinate radius $r$, we have

$$u_E^0 = \left[\left(1 - \frac{2\mu}{r}\right) - \frac{r^2\Omega^2}{c^2}\right]^{-1/2} = \left(1 - \frac{3\mu}{r}\right)^{-1/2}.$$

Our general expression (10.17) therefore yields

$$\frac{\nu_R}{\nu_E} = \frac{p_0(R)}{p_0(E)u_E^0 + p_3(E)u_E^3} = \left(1 - \frac{3\mu}{r}\right)^{1/2}\frac{p_0(R)}{p_0(E)}\left[1 \pm \frac{p_3(E)}{p_0(E)}\Omega\right]^{-1},$$

where the plus sign corresponds to the emitting matter on the side of the disc
moving towards the observer and the minus sign corresponds to the matter on
the other side. However, the Schwarzschild metric is stationary, i.e. the metric
components $g_{\mu\nu}$ do not depend explicitly on $t$. Thus $p_0$ is conserved along the
geodesic, and so

$$\frac{\nu_R}{\nu_E} = \left(1 - \frac{3\mu}{r}\right)^{1/2}\left[1 \pm \frac{p_3(E)}{p_0(E)}\Omega\right]^{-1}.$$

It therefore remains only to fix the ratio $p_3(E)/p_0(E)$ in order to determine the
observed redshift. In general, we must use the fact that the photon worldline is
null, and so $g^{\mu\nu}p_\mu p_\nu = 0$. As we are working in the equatorial plane $\theta = \pi/2$,
this yields

$$\frac{1}{c^2}\left(1 - \frac{2\mu}{r}\right)^{-1}(p_0)^2 - \left(1 - \frac{2\mu}{r}\right)(p_1)^2 - \frac{1}{r^2}(p_3)^2 = 0. \tag{10.18}$$

For photons emitted from material at a general position angle $\phi$, one would now
need to use the geodesic equations for the photon worldline in order to eliminate
$p_1$. There are, however, two special cases for which this is not necessary.

The simplest case occurs when the photon is emitted from matter moving
transverse to the observer, i.e. when $\phi = 0$ or $\phi = \pi$. We then have $p_3(E) = 0$,
and so the observed frequency ratio is

$$\frac{\nu_R}{\nu_E} = \left(1 - \frac{3\mu}{r}\right)^{1/2}. \tag{10.19}$$

The other simple cases occur when the matter is moving either directly towards
or away from the observer, i.e. when $\phi = -\pi/2$ or $\phi = \pi/2$. Then the radial
component of the photon 4-momentum, $p_1(E)$, will be zero. From (10.18) we
obtain

$$\frac{p_3(E)}{p_0(E)} = \frac{r}{c(1 - 2\mu/r)^{1/2}},$$

so that the photon frequency shift for $\phi = \mp\pi/2$ is given by

$$\frac{\nu_R}{\nu_E} = \frac{(1 - 3\mu/r)^{1/2}}{1 \pm (r/\mu - 2)^{-1/2}}. \tag{10.20}$$

The above discussion has been for a disc viewed edge-on. The other limiting case,
when the disc is viewed face-on, is easier to analyse. Since the motion of the
emitting matter is always transverse to the observer, the frequency shift is given
by (10.19).

Although the observed iron line consists of photons coming from different
radii in the disc, we may still calculate the smallest possible frequency (or largest
redshift) present in the observed spectrum. It is clear that such photons must
be emitted from the smallest possible value of $r$. As discussed in the previous
chapter, the innermost stable circular orbit for the Schwarzschild metric is at
$r = 6\mu$. The smallest frequency represented is therefore given by

$$\frac{\nu_R}{\nu_E} = \begin{cases} \sqrt{2}/3 \approx 0.47 & \text{for a disc viewed edge-on,} \\ 1/\sqrt{2} \approx 0.71 & \text{for a disc viewed face-on.} \end{cases}$$
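The two limiting values can be checked by evaluating (10.19) and (10.20) at the innermost stable circular orbit. In the sketch below the radius is expressed as the dimensionless ratio $r/\mu$, and the `sign` argument selects the upper or lower sign in (10.20); the function names are illustrative assumptions.

```python
import math

def ratio_transverse(r_over_mu):
    """Frequency ratio (10.19) for emitting matter moving transverse to the observer."""
    return math.sqrt(1.0 - 3.0 / r_over_mu)

def ratio_line_of_sight(r_over_mu, sign):
    """Frequency ratio (10.20); sign = +1 or -1 selects the upper or lower sign."""
    return math.sqrt(1.0 - 3.0 / r_over_mu) / (1.0 + sign / math.sqrt(r_over_mu - 2.0))

if __name__ == "__main__":
    r_isco = 6.0  # innermost stable circular orbit, in units of mu
    print(f"edge-on minimum : {ratio_line_of_sight(r_isco, +1):.2f}")
    print(f"face-on         : {ratio_transverse(r_isco):.2f}")
    print(f"edge-on maximum : {ratio_line_of_sight(r_isco, -1):.2f}")
```

The lower sign at the same radius gives a ratio greater than one, so for an edge-on disc the line is spread to both sides of its rest frequency, with the largest redshift set by the upper sign.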


If the central object were rotating (so that the exterior geometry is given by the 
Kerr metric - see Chapter 13), then the smallest frequencies could be even lower. 
Figure 10.9 shows the iron spectral line measured in the galaxy MCG-6-30-15. 

In general the detailed shape of the line profile depends on the mass and rotation
of the central object, the inclination of the disc to the line of sight and relativistic
beaming effects. It is hoped that, in the future, line profiles can be measured to
sufficient accuracy to determine the mass and angular momentum of the central
compact object.

Figure 10.9 The line profile of the iron 6.4 keV spectral line from MCG-6-30-15
observed by the ASCA satellite (Y. Tanaka et al., Nature 375, 659, 1995). The
emission line is extremely broad, the width indicating velocities of order
one-third the speed of light. There is a marked asymmetry towards energies lower
than the rest energy of the emission line, with a smallest energy of about 4 keV.
The solid line shows a fit to the data assuming a disc around a non-rotating
Schwarzschild black hole, extending between 3 and 10 Schwarzschild radii and
inclined at an angle of 30° to the line of sight. Certain features suggest that the
central object may in fact be rapidly rotating.
[Plot not reproduced. Horizontal axis: energy (keV).]


10.5 The geodesic precession of gyroscopes 


We have seen how the motion of test bodies can be used to explore the geometry 
of a curved spacetime. If the test body has spin then the motion of its spin vector 
can also be used to probe the spacetime geometry. Here we discuss the idealised 
case of an infinitesimally small test body with spin, such as a small gyroscope. 

The test body moves along a timelike geodesic curve, so its 4-velocity $\boldsymbol{u}(\tau)$
is parallel-transported along its worldline. Thus, in some coordinate system, its
components satisfy

$$\frac{du^\mu}{d\tau} + \Gamma^\mu{}_{\nu\sigma}u^\nu u^\sigma = 0.$$

Suppose that the spin of the test body is described by the 4-vector $\boldsymbol{s}(\tau)$ along the
geodesic. Since this vector can have no timelike component in the instantaneous
rest frame of the test body, we require that at all points along the geodesic

$$\boldsymbol{s}\cdot\boldsymbol{u} = g_{\mu\nu}s^\mu u^\nu = 0. \tag{10.21}$$

Since the 4-velocity $\boldsymbol{u}$ of the test body is parallel-transported along its geodesic,
to ensure that the inner product is conserved at all points along the worldline
we require that $\boldsymbol{s}(\tau)$ is also parallel-transported along the geodesic. Hence its
components must satisfy

$$\frac{ds^\mu}{d\tau} + \Gamma^\mu{}_{\nu\sigma}s^\nu u^\sigma = 0. \tag{10.22}$$

<7T 

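Let us now suppose that the test body is in a circular orbit of coordinate radius
$r$ in the equatorial plane of the Schwarzschild geometry. Using the expressions we
derived in Chapter 9 for the connection coefficients $\Gamma^\mu{}_{\nu\sigma}$ for the Schwarzschild
metric in $(t, r, \theta, \phi)$ coordinates (with $\theta = \pi/2$), one finds that most of the $\Gamma^\mu{}_{\nu\sigma}$
are zero. Moreover, for a test body in a circular orbit we have $u^1 = u^2 = 0$, and
we find that the equations (10.22) reduce to

$$\frac{ds^0}{d\tau} + \Gamma^0{}_{10}s^1u^0 = 0, \tag{10.23}$$

$$\frac{ds^1}{d\tau} + \Gamma^1{}_{00}s^0u^0 + \Gamma^1{}_{33}s^3u^3 = 0, \tag{10.24}$$

$$\frac{ds^2}{d\tau} = 0, \tag{10.25}$$

$$\frac{ds^3}{d\tau} + \Gamma^3{}_{13}s^1u^3 = 0, \tag{10.26}$$

where the connection coefficients are given by

$$\Gamma^0{}_{10} = \frac{\mu}{r^2(1 - 2\mu/r)}, \qquad \Gamma^1{}_{00} = \frac{c^2\mu}{r^2}\left(1 - \frac{2\mu}{r}\right), \qquad \Gamma^1{}_{33} = -r\left(1 - \frac{2\mu}{r}\right), \qquad \Gamma^3{}_{13} = \frac{1}{r}.$$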
Moreover, from our discussion in the previous section, we can write the test
body’s 4-velocity as

$$[u^\mu] = u^0(1, 0, 0, \Omega),$$

where $u^0 = dt/d\tau = (1 - 3\mu/r)^{-1/2}$ and $\Omega = d\phi/dt = (\mu c^2/r^3)^{1/2}$ are both
constants.

Since $u^1 = u^2 = 0$ the orthogonality condition (10.21) reduces to

$$c^2(1 - 2\mu/r)s^0u^0 - r^2s^3u^3 = 0,$$

and noting that $u^3/u^0 = \Omega$ we may express $s^0$ in terms of $s^3$:

$$s^0 = \frac{\Omega r^2}{c^2(1 - 2\mu/r)}s^3.$$

Using this result it is straightforward to show that equation (10.23) is equivalent
to equation (10.26). Thus the system of equations reduces to

$$\frac{ds^1}{d\tau} - \frac{r\Omega}{u^0}s^3 = 0, \qquad \frac{ds^2}{d\tau} = 0, \qquad \frac{ds^3}{d\tau} + \frac{u^0\Omega}{r}s^1 = 0. \tag{10.27}$$


It is more convenient to convert the $\tau$-derivatives to $t$-derivatives using $u^0 =
dt/d\tau$. Then, on using the third equation to eliminate $s^3$ from the first, the system
of equations becomes

$$\frac{d^2s^1}{dt^2} + \frac{\Omega^2}{(u^0)^2}s^1 = 0, \qquad \frac{ds^2}{dt} = 0, \qquad \frac{ds^3}{dt} + \frac{\Omega}{r}s^1 = 0.$$

Let us take the initial spatial direction $\vec{s}$ of the spin vector to be radial, so that
$s^2(0) = s^3(0) = 0$. The corresponding solution to our system of equations is easily
shown to be

$$s^1(t) = s^1(0)\cos\Omega' t, \qquad s^2(t) = 0, \qquad s^3(t) = -\frac{\Omega}{r\Omega'}s^1(0)\sin\Omega' t, \tag{10.28}$$

where $\Omega' = \Omega/u^0 = \Omega(1 - 3\mu/r)^{1/2}$. This solution shows that the spatial part
$\vec{s}$ of the spin vector rotates relative to the radial direction with a coordinate
angular speed $\Omega'$ in the negative $\phi$-direction. However, the radial direction itself
rotates with coordinate angular speed $\Omega$ in the positive $\phi$-direction, and it is the
difference between these two speeds that gives rise to the geodesic precession
effect. This is illustrated in Figure 10.10. Since one revolution is completed in
a coordinate time $t = 2\pi/\Omega$, the final direction of $\vec{s}$ is therefore $2\pi + \alpha$, where
$\alpha = (2\pi/\Omega)(\Omega - \Omega')$. Thus, after one revolution the spatial spin vector is rotated
in the direction of the orbital motion by an angle

$$\alpha = 2\pi[1 - (1 - 3\mu/r)^{1/2}].$$

Figure 10.10 The geodesic precession effect for a spinning object in a circular
orbit in the equatorial plane of the Schwarzschild geometry. Here the initial
direction ($t = 0$) is radial.

The geodesic precession effect may be observable experimentally by measuring 
the spacelike spin vector of a gyroscope in an orbiting spacecraft. Although the 
effect is small, it is cumulative. Thus, for a gyroscope in a near-Earth orbit, the 
precession rate is about 8" per year, which should be measurable. (In fact there is 
an additional very small effect, which may also be measurable, due to the fact that 
the Earth is slowly rotating and so the geometry outside it is correctly described 
by the Kerr metric). In April 2004, NASA launched the Gravity Probe B (GP-B) 
satellite to carry out this experiment and it is currently recording measurements; 
the results are eagerly awaited. 
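The size of the effect for a near-Earth orbit can be estimated from the per-orbit angle $\alpha = 2\pi[1 - (1 - 3\mu/r)^{1/2}]$ multiplied by the number of orbits per year, with the orbital period taken as $2\pi/\Omega$ in coordinate time. The sketch below is illustrative: the Earth's mass and radius, the 650 km altitude and all names are assumptions, and the Earth's rotation is ignored.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_EARTH = 5.972e24  # kg (assumed value)
R_EARTH = 6.371e6   # m (assumed value)
SECONDS_PER_YEAR = 365.25 * 86400.0
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def precession_per_orbit(mass_kg, r_m):
    """Geodesic precession angle per revolution, in radians."""
    x = 3.0 * G * mass_kg / (C**2 * r_m)  # this is 3 mu / r
    # 1 - sqrt(1 - x) is rewritten as x / (1 + sqrt(1 - x)) to avoid
    # floating-point cancellation when x is very small.
    return 2.0 * math.pi * x / (1.0 + math.sqrt(1.0 - x))

def precession_per_year_arcsec(mass_kg, r_m):
    """Accumulated precession over one year of circular orbits, in arcseconds."""
    omega = math.sqrt(G * mass_kg / r_m**3)  # coordinate angular speed
    orbits_per_year = SECONDS_PER_YEAR * omega / (2.0 * math.pi)
    return precession_per_orbit(mass_kg, r_m) * orbits_per_year * ARCSEC_PER_RADIAN

if __name__ == "__main__":
    for altitude_km in (0.0, 650.0):
        r = R_EARTH + altitude_km * 1e3
        rate = precession_per_year_arcsec(M_EARTH, r)
        print(f"altitude {altitude_km:5.0f} km: {rate:.1f} arcsec per year")
```

The rate is of order several arcseconds per year and decreases with orbital radius roughly as $r^{-5/2}$, so the figure of about 8″ per year corresponds to an orbit skimming the Earth's surface.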


Exercises 

10.1 Show that the equation of motion for planetary orbits in Newtonian gravity is 

$$\frac{d^2u}{d\phi^2} + u = \frac{GM}{h^2},$$

where $u = 1/r$ and $r$ is the radial distance from the centre of mass of the central
object.

10.2 Show that the equation of motion in Exercise 10.1 has the solution

$$u = \frac{GM}{h^2}(1 + e\cos\phi),$$

and that this describes an ellipse. Show further that

$$a = \frac{h^2}{GM(1 - e^2)},$$

where $r_1 = a(1 - e)$ and $r_2 = a(1 + e)$ are the distances of closest and furthest
approach respectively.

10.3 Verify that the general-relativistic equation of motion for planetary orbits (10.3)
has the solution

$$u = \frac{GM}{h^2}(1 + e\cos\phi) + \frac{3(GM)^3}{c^2h^4}\left[1 + e^2\left(\tfrac{1}{2} - \tfrac{1}{6}\cos 2\phi\right) + e\phi\sin\phi\right],$$

to first order in the relativistic perturbation to the Newtonian solution.

10.4 Verify that the general-relativistic equation of motion for a photon trajectory (10.9)
has the solution

$$u = \frac{\sin\phi}{b} + \frac{3GM}{2c^2b^2}\left(1 + \tfrac{1}{3}\cos 2\phi\right),$$

to first order in the relativistic perturbation to a straight-line path.

10.5 Show that the gravitational deflection of light by the Sun predicted in the
Newtonian theory of gravity is exactly half the value predicted in general relativity.

10.6 In the radar-echoes test, a photon travels from radial coordinate $r$ to $r_0$, which is
the radial coordinate of the closest approach of the photon to the Sun. Verify that,
to first order in $\mu/r$, the elapsed coordinate time is given by

$$t(r, r_0) = \frac{(r^2 - r_0^2)^{1/2}}{c} + \frac{2\mu}{c}\ln\left[\frac{r + (r^2 - r_0^2)^{1/2}}{r_0}\right] + \frac{\mu}{c}\left(\frac{r - r_0}{r + r_0}\right)^{1/2}.$$

10.7 An accretion disc extends from $r = 6\mu$ to $r = 20\mu$ in the equatorial plane of the
Schwarzschild geometry. A photon is emitted radially outwards by a particle on 
the inner edge of the disc and is absorbed by a particle on the outer edge of the 
disc. Find the ratio of the energy absorbed to that emitted. 

10.8 For a gyroscope in a circular orbit in the equatorial plane of the Schwarzschild
geometry, show that the components of its spin 4-vector satisfy
equations (10.23)-(10.26).

10.9 Show that the system of equations (10.27) has the solution (10.28). 

10.10 A gyroscope in a circular orbit of radius r in the equatorial plane of the 
Schwarzschild geometry has its spatial spin vector $\vec{s}$ also lying in the equatorial
plane. Show that, after one complete orbit, the angle between the initial and final
directions of the spatial spin vector is given by

$$\alpha = 2\pi[1 - (1 - 3\mu/r)^{1/2}],$$

irrespective of the initial direction of the spin vector. Does this still hold if the
original spatial spin vector does not lie in the plane of the orbit?

Schwarzschild black holes


In our discussion of the Schwarzschild geometry, we have thus far used the
coordinates $(t, r, \theta, \phi)$ to label events in the spacetime. In this context, $(t, r, \theta, \phi)$
are called the Schwarzschild coordinates. Moreover, until now we have been
concerned only with the exterior region $r > 2\mu$. We now turn to the discussion
of the Schwarzschild geometry in the interior region $r < 2\mu$, and the significance
of the hypersurface $r = 2\mu$. We shall see that, in order to understand the entire
Schwarzschild geometry, we must relabel the events in spacetime using different
sets of coordinates.


11.1 The characterisation of coordinates 

Before discussing the Schwarzschild geometry in detail, let us briefly consider the
characterisation of coordinates. In general, if we wish to write down a solution
of Einstein’s field equations then we need to do so in some particular coordinate
system. But what, if any, is the significance of any such system? For example,
suppose we take the Schwarzschild solution and apply some complicated
coordinate transformation $x^\mu \to x'^\mu$. The resulting metric will still be a solution of
the empty-space field equations, of course, but there is likely to be little or no
physical or geometrical significance attached to the new coordinates $x'^\mu$.

One thing we can do, however, is to establish whether at some event $P$ a
coordinate $x^\mu$ is timelike, null or spacelike. This corresponds directly to the nature
of the tangent vector $\boldsymbol{e}_\mu$ to the coordinate curve at $P$. The easiest way to determine
this property of the coordinate is to fix the other coordinates at their values at $P$
and consider an infinitesimal variation $dx^\mu$ in the coordinate of interest. If the
corresponding change in the interval $ds^2$ is positive, zero or negative then $x^\mu$ is
timelike, null or spacelike respectively. This, in turn, corresponds simply to the
sign of the relevant diagonal element $g_{\mu\mu}$ (no sum) of the metric.

11.2 Singularities in the Schwarzschild metric

With these ideas in mind, let us look at the Schwarzschild metric in the traditional
$(t, r, \theta, \phi)$ coordinate system. We have

$$ds^2 = c^2\left(1 - \frac{2GM}{c^2r}\right)dt^2 - \left(1 - \frac{2GM}{c^2r}\right)^{-1}dr^2 - r^2d\theta^2 - r^2\sin^2\theta\,d\phi^2. \tag{11.1}$$

Inspection of this line element shows immediately that the metric is singular at
$r = 0$ and $r = 2GM/c^2$. The latter value is known as the Schwarzschild radius
and is often denoted $r_S$, so that

$$r_S = \frac{2GM}{c^2}.$$

We must remember, however, that we derived the Schwarzschild solution by
solving the vacuum field equations $R_{\mu\nu} = 0$, and so the metric given by (11.1) is
only valid down to the surface of the spherical matter distribution. For example,
the Schwarzschild radius for the Sun is

$$r_S = \frac{2GM_\odot}{c^2} = 2.95\ \text{km},$$

which is much smaller than the radius of the Sun ($R_\odot = 7 \times 10^5$ km). Similarly,
the Schwarzschild radius for a proton is

$$r_S = \frac{2GM_p}{c^2} \approx 2.5 \times 10^{-54}\ \text{m},$$

again much smaller than the characteristic radius of a proton ($R_p = 10^{-15}$ m). In
fact, for most real objects the Schwarzschild radius lies deep within the object,
where the vacuum field equations do not apply. But what if there exist objects so
compact that they lie well within the Schwarzschild radius? For such an object, the
Schwarzschild solution looks very odd. Ignoring for the moment the singularity
in the metric at $r = r_S$, let us denote the region $r > r_S$ as region I, and $r < r_S$ as
region II.
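The solar figure is a one-line calculation from $r_S = 2GM/c^2$. The sketch below evaluates it and compares it with the Sun's radius; the numerical constants and the function name are assumed values chosen for illustration.

```python
G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_SUN = 1.989e30  # solar mass, kg (assumed value)
R_SUN = 7.0e8     # solar radius, m (rounded as in the text)

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius 2GM/c^2 in metres."""
    return 2.0 * G * mass_kg / C**2

if __name__ == "__main__":
    r_s = schwarzschild_radius(M_SUN)
    print(f"Sun: r_S = {r_s / 1e3:.2f} km, r_S / R_sun = {r_s / R_SUN:.1e}")
```

The ratio of a few parts in a million shows how far inside the Sun the Schwarzschild radius lies, well within the region where the vacuum solution does not apply.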

From the Schwarzschild metric (11.1) we see that, in region I, the metric
coefficient $g_{00}$ is positive and the $g_{ii}$ (for $i = 1, 2, 3$) are negative. It therefore
follows that for $r > r_S$ the coordinate $t$ is timelike and the coordinates $r$, $\theta$, $\phi$
are spacelike. Indeed, in region I we may attach simple physical meanings to the
coordinates. For example, $t$ is the proper time measured by an observer at rest at
infinity. Similarly, $r$ is a radial coordinate with the property that the surface area
of a 2-sphere $t$ = constant, $r$ = constant is $4\pi r^2$. In region II, however, the metric
coefficients $g_{00}$ and $g_{11}$ change sign. Hence, for $r < r_S$, $t$ is a spacelike coordinate
and $r$ is timelike. Thus ‘time’ and ‘radial’ coordinates swap character on either
side of $r = r_S$. It is natural to ask what this means, and, indeed, whether it is
physically meaningful.
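The swap in character can be seen by applying the sign rule of Section 11.1 to the diagonal elements $g_{00}$ and $g_{11}$ of (11.1). The sketch below works with the dimensionless ratio $r/r_S$ and reports the character of the $t$ and $r$ coordinates; the function names are illustrative assumptions, and the point $r = r_S$ itself is excluded because $g_{11}$ is undefined there.

```python
def metric_coefficients(r_over_rs):
    """Return (g_00 / c^2, g_11) of the Schwarzschild metric (11.1) at r = r_over_rs * r_S."""
    if r_over_rs <= 0.0 or r_over_rs == 1.0:
        raise ValueError("r must be positive and different from r_S")
    f = 1.0 - 1.0 / r_over_rs
    return f, -1.0 / f

def character(g_diagonal):
    """Classify a coordinate from the sign of its diagonal metric element."""
    if g_diagonal > 0.0:
        return "timelike"
    if g_diagonal < 0.0:
        return "spacelike"
    return "null"

def coordinate_character(r_over_rs):
    """Character of the t and r coordinates at the given radius."""
    g00, g11 = metric_coefficients(r_over_rs)
    return character(g00), character(g11)

if __name__ == "__main__":
    for x in (2.0, 0.5):
        t_char, r_char = coordinate_character(x)
        print(f"r = {x} r_S: t is {t_char}, r is {r_char}")
```

Both coefficients flip sign together across $r = r_S$, so the signature of the metric is unchanged even though the roles of the two coordinates are exchanged.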

Let us therefore consider in more detail the singularities in the metric at
$r = 0$ and $r = r_S$. We must remember that coordinates are simply a way of
labelling events in spacetime. The physically meaningful geometric quantities are
the 4-tensors defined at any point on the spacetime manifold. Spacetime curvature
is described covariantly by the components of the curvature tensor $R_{\mu\nu\rho\sigma}$ (and its
contractions), which we may easily calculate for the Schwarzschild metric (11.1).
For example, the curvature scalar at any point is given by

$$R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma} = \frac{48\mu^2}{r^6}, \tag{11.2}$$

which, with $\mu = GM/c^2$ so that $r_S = 2\mu$, we see is finite at $r = r_S$. Moreover, since
it is a scalar, its value remains
the same in all coordinate systems. Thus the spacetime curvature at $r = r_S$ is
perfectly well behaved, and so we see that $r = r_S$ is a coordinate singularity. By
the same token, (11.2) is singular at $r = 0$ and so this point is a true intrinsic
singularity of the Schwarzschild geometry.
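The behaviour of the curvature scalar may be tabulated directly. The sketch below assumes the standard Schwarzschild result $R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma} = 48\mu^2/r^6$ with $\mu = GM/c^2$, and works in units where $\mu = 1$; the function name is illustrative only.

```python
def curvature_scalar(r, mu=1.0):
    """Return 48 mu^2 / r^6, the scalar R_{abcd} R^{abcd} (units of length^-4)."""
    return 48.0 * mu**2 / r**6


mu = 1.0
for r in (10.0, 2.0, 1.0, 0.1):
    # r = 2 mu is the Schwarzschild radius; nothing special happens there.
    print(f"r = {r:5.1f} mu: curvature scalar = {curvature_scalar(r, mu):.3e} / mu^4")
```

At $r = 2\mu$ the value is $3/(4\mu^4)$, which is finite, whereas the values grow without bound as $r \to 0$.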

We may illustrate the idea of coordinate singularities with a simple example.
As discussed in Chapter 2, one may write the line element for the surface of a
2-sphere as

$$ds^2 = \frac{a^2\, d\rho^2}{a^2 - \rho^2} + \rho^2\, d\phi^2.$$

This line element has a singularity at $\rho = a$. Embedding this manifold, for the
moment, in three-dimensional Euclidean space, we know that $\rho = a$ corresponds
simply to the equator of the sphere (relative to the origin of the coordinate system)
and it is clear why the $(\rho, \phi)$ coordinates cover the surface of the sphere uniquely
only up to this point. There is nothing pathological occurring in the intrinsic
geometry of the 2-sphere at the equator, i.e. there is no ‘real’ (or intrinsic) singularity
in the metric. As shown in Appendix 7A, the Gaussian curvature of a 2-sphere
is simply $K = 1/a^2$, which does not ‘blow up’ anywhere. Thus, $\rho = a$ is only a
coordinate singularity, which has resulted simply from choosing coordinates with
a restricted domain of validity. In an analogous way, the coordinate singularity
of the Schwarzschild metric is simply a result of the coordinate system that we
have chosen to use. We can remove it by making appropriate transformations
of coordinates, which we will discuss later. For the time being, however, let us
continue our investigation of the Schwarzschild geometry using the Schwarzschild
coordinates $(t, r, \theta, \phi)$.
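The harmless nature of the singularity at $\rho = a$ can be seen numerically. The sketch below assumes the 2-sphere line element in the form $ds^2 = a^2\,d\rho^2/(a^2 - \rho^2) + \rho^2\,d\phi^2$, for which the proper distance from the pole along a meridian is $a \arcsin(\rho/a)$; the function names are illustrative.

```python
import math


def g_rho_rho(rho, a):
    """Metric coefficient multiplying d(rho)^2; diverges as rho -> a."""
    return a**2 / (a**2 - rho**2)


def proper_distance(rho, a):
    """Closed-form integral of sqrt(g_rho_rho) from 0 to rho along a meridian."""
    return a * math.asin(rho / a)


def proper_distance_numeric(rho, a, n=20000):
    """Midpoint-rule estimate of the same integral, as an independent check."""
    h = rho / n
    return sum(math.sqrt(g_rho_rho((i + 0.5) * h, a)) * h for i in range(n))


a = 1.0
print("g_rho_rho just inside the equator:", g_rho_rho(0.999999 * a, a))
print("pole-to-equator distance:", proper_distance(a, a), "vs pi*a/2 =", math.pi * a / 2)
```

The metric coefficient grows without limit near the equator, yet the distance from the pole to the equator is the perfectly finite quarter-circumference $\pi a/2$.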


11.3 Radial photon worldlines in Schwarzschild coordinates

Let us investigate the spacetime diagram of the Schwarzschild solution in
$(t, r, \theta, \phi)$ coordinates. The metric reads

$$ds^2 = c^2\left(1 - \frac{2\mu}{r}\right) dt^2 - \left(1 - \frac{2\mu}{r}\right)^{-1} dr^2 - r^2\, d\Omega^2,$$

where $d\Omega$ is an element of solid angle. We have written it in this form because
we shall usually ignore the angular coordinates in drawing spacetime diagrams,
i.e. these diagrams will show the $(r, ct)$-plane for fixed values of $\theta$ and $\phi$.

We begin by determining the lightcone structure in the diagram, by considering
the paths of radially incoming and outgoing photons; these were discussed briefly
in Section 9.11. From the metric, for a radially moving photon we have

$$\frac{dt}{dr} = \pm \frac{1}{c}\left(1 - \frac{2\mu}{r}\right)^{-1},$$

where the plus sign corresponds to a photon that is outgoing (in that $dr/dt$ is
positive in the region $r > 2\mu$) and the minus sign corresponds to a photon that
is incoming (in that $dr/dt$ is negative in the region $r > 2\mu$). On integrating, we
obtain

$$ct = r + 2\mu \ln\left|\frac{r}{2\mu} - 1\right| + \text{constant} \qquad \text{(outgoing photon)},$$

$$ct = -r - 2\mu \ln\left|\frac{r}{2\mu} - 1\right| + \text{constant} \qquad \text{(incoming photon)}.$$

Notice that under the transformation $t \to -t$ the incoming and outgoing photon
paths are interchanged, as we would expect. We can now plot these curves in the
$(r, ct)$-plane, as shown in Figure 11.1. The diagram is drawn for fixed $\theta$ and $\phi$.
Since the diagram will be the same for all other $\theta$ and $\phi$, we should think of each
point $(r, ct)$ in the diagram as representing a 2-sphere of area $4\pi r^2$.
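The photon curves are easily tabulated for plotting. The following sketch, in units with $\mu = 1$ by default, encodes the two integrated worldlines $ct = \pm\left(r + 2\mu\ln|r/2\mu - 1|\right) + \text{constant}$ and the slope $d(ct)/dr = (1 - 2\mu/r)^{-1}$ of the outgoing family; the function names are our own.

```python
import math


def ct_outgoing(r, mu=1.0, const=0.0):
    """ct along the 'plus sign' radial photon worldline."""
    return r + 2.0 * mu * math.log(abs(r / (2.0 * mu) - 1.0)) + const


def ct_incoming(r, mu=1.0, const=0.0):
    """ct along the 'minus sign' radial photon worldline (the t -> -t image)."""
    return -r - 2.0 * mu * math.log(abs(r / (2.0 * mu) - 1.0)) + const


def slope_outgoing(r, mu=1.0):
    """d(ct)/dr for the plus-sign family; tends to 1 at large r."""
    return 1.0 / (1.0 - 2.0 * mu / r)


for r in (1.0, 1.9, 2.1, 3.0, 50.0):
    print(f"r = {r:5.1f}: ct_out = {ct_outgoing(r):8.3f}, slope = {slope_outgoing(r):8.3f}")
```

The slope approaches unity far from the hole, grows without bound as $r \to 2\mu$ from outside, and is negative for $r < 2\mu$.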

Figure 11.1 requires some words of explanation. At large radii in region I the
gravitational field becomes weak and the metric tends to the Minkowski metric
of special relativity. Thus, as expected, the lightcone structure becomes that of
Minkowski spacetime, where incoming and outgoing light rays define straight
lines of slope $\pm 1$ in the diagram. As we approach the Schwarzschild radius, the
ingoing light rays tend to the ordinate $t \to +\infty$ and outgoing light rays tend to
$t \to -\infty$. This seems to suggest that it takes an infinite time for an incoming signal
to cross the Schwarzschild radius, but in this respect the diagram is misleading,
as we shall see shortly (we discussed this point briefly in Section 9.11).

Figure 11.1 Lightcone structure of the Schwarzschild solution.

In region II the lightcones flip their orientation by 90°, since the coordinates
$t$ and $r$ reverse their character. We see that all photons in this region must end
up at $r = 0$. At this point there is a real singularity, where the curvature of the
Schwarzschild solution diverges. Moreover, any massive particle in region II
must also end up at the singularity, since a timelike worldline must lie within
the forward light-cone at each point. Thus we conclude that once within the
Schwarzschild radius you necessarily end up at a spacelike singularity at $r = 0$.
To escape would require a violation of causality.


11.4 Radial particle worldlines in Schwarzschild coordinates 

The causal structure in Figure 11.1 is determined by radially moving photons.
It is also of interest to determine the worldlines of radially moving massive
particles in Schwarzschild coordinates. For simplicity let us consider an infalling
particle released from rest at infinity, which we investigated in detail in Chapter 9.
Parameterising the particle worldline in terms of the proper time $\tau$, we found that
the trajectory $r(\tau)$ could be written implicitly as

$$\tau = \frac{2}{3}\sqrt{\frac{r_0^3}{2\mu c^2}} - \frac{2}{3}\sqrt{\frac{r^3}{2\mu c^2}}, \tag{11.3}$$

taking $\tau = 0$ at $r = r_0$. Alternatively, if the trajectory is described as $r(t)$, where
$t$ is the coordinate time, we found that

$$t = \frac{2}{3}\left(\sqrt{\frac{r_0^3}{2\mu c^2}} - \sqrt{\frac{r^3}{2\mu c^2}}\right) + \frac{4\mu}{c}\left(\sqrt{\frac{r_0}{2\mu}} - \sqrt{\frac{r}{2\mu}}\right) + \frac{2\mu}{c}\ln\left|\left(\frac{\sqrt{r/(2\mu)} + 1}{\sqrt{r/(2\mu)} - 1}\right)\left(\frac{\sqrt{r_0/(2\mu)} - 1}{\sqrt{r_0/(2\mu)} + 1}\right)\right|, \tag{11.4}$$

where $t = 0$ at $r = r_0$. Using equations (11.3) and (11.4), we can associate a given
value of the particle’s proper time $\tau$ with a point in a $(r, ct)$-diagram. Thus, as $\tau$
increases, we can plot out the particle trajectory in the $(r, ct)$-plane.


Figure 11.2 Trajectory of a radially infalling particle released from rest at
infinity. The dots correspond to unit intervals of $c\tau/\mu$, where $\tau$ is the particle’s
proper time and we have taken $\tau = t = 0$ at $r_0 = 8\mu$.

The corresponding curve is shown in Figure 11.2, which is a more quantitative
version of Figure 9.6; we have taken $\tau = t = 0$ at $r_0 = 8\mu$. Also plotted are dots
showing unit intervals of $c\tau/\mu$, together with the light-cone structure at particular
points on the trajectory. We see from the plot that the particle worldline has a
singularity at $r = 2\mu$ and that it takes an infinite coordinate time $t$ for the particle
to travel from $r = 8\mu$ to $r = 2\mu$. Since $t$ is the time experienced by a stationary
observer at large radius, to such an observer it thus takes an infinite time for the
particle to reach $r = 2\mu$. However, the proper time taken by the particle to reach
$r = 2\mu$ is finite ($\tau = 9.33\mu/c$). Moreover, we see that for later values of $\tau$ the
particle worldline lies in the region $r < 2\mu$, which was not plotted in Figure 9.6.
In this region the coordinates $t$ and $r$ swap character, as indicated by the fact
that the light-cone is flipped by 90°. For $r < 2\mu$, we also note that, although $\tau$
continues to increase until $r = 0$ is reached ($\tau = 10.67\mu/c$), the coordinate time
$t$ decreases along the particle worldline.
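The numbers quoted above can be reproduced with a short calculation. The sketch below assumes the Chapter 9 results for a particle released from rest at infinity: the proper time $\tau = \tfrac{2}{3}\left[\sqrt{r_0^3/(2\mu c^2)} - \sqrt{r^3/(2\mu c^2)}\right]$, and the coordinate time obtained by integrating $dt/dr = -(1/c)\sqrt{r/2\mu}\,(1 - 2\mu/r)^{-1}$. Units with $\mu = c = 1$ are used by default, and the function names are illustrative.

```python
import math


def tau_infall(r, r0, mu=1.0, c=1.0):
    """Proper time elapsed in falling from r0 to r (particle at rest at infinity)."""
    return (2.0 / 3.0) * (math.sqrt(r0**3 / (2.0 * mu * c**2))
                          - math.sqrt(r**3 / (2.0 * mu * c**2)))


def t_infall(r, r0, mu=1.0, c=1.0):
    """Schwarzschild coordinate time elapsed in falling from r0 to r (r != 2 mu)."""
    x, x0 = math.sqrt(r / (2.0 * mu)), math.sqrt(r0 / (2.0 * mu))
    log_term = math.log(abs((x + 1.0) / (x - 1.0) * (x0 - 1.0) / (x0 + 1.0)))
    return tau_infall(r, r0, mu, c) + (4.0 * mu / c) * (x0 - x) + (2.0 * mu / c) * log_term


r0 = 8.0
print("tau at r = 2 mu:", round(tau_infall(2.0, r0), 2))   # proper time to the horizon
print("tau at r = 0   :", round(tau_infall(0.0, r0), 2))   # proper time to the singularity
for r in (2.1, 2.001, 2.000001, 1.5, 1.0, 0.5):
    print(f"r = {r:9.6f}: tau = {tau_infall(r, r0):6.3f}, t = {t_infall(r, r0):7.3f}")
```

The proper time remains finite throughout, whereas $t$ grows logarithmically without bound as $r \to 2\mu$ from outside and then decreases with decreasing $r$ inside.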

Clearly, although the coordinate $t$ is useful and physically meaningful as $r \to \infty$,
it is inappropriate for describing particle motion at $r < 2\mu$. Therefore, in the
following section we introduce a new time coordinate that is adapted to describing
radial infall, and in the process we shall remove the coordinate singularity at
$r = 2\mu$.


11.5 Eddington-Finkelstein coordinates 

The spacetime diagrams in Figures 11.1 and 11.2 show that the worldlines of
both radially moving photons and massive particles cross $r = 2\mu$ only at $t = \pm\infty$.
This suggests that the ‘line’ $r = 2\mu$, $-\infty < t < \infty$ might really not be a line at all,
but a single point. That is, our coordinates may go bad owing to the expansion
of a single event into the whole line $r = 2\mu$. One technique for circumventing
the problem of unsatisfactory coordinates is to ‘probe’ spacetime with geodesics,
which after all are coordinate independent and will not be affected in any way by
the boundaries of coordinate validity. Of the many possibilities, we will use as
probes the null worldlines of radially moving photons.¹

¹ It is also possible to use the timelike worldlines of freely falling radially moving massive particles as probes of
the spacetime geometry. The traditional approach leads to useful new coordinates, called Novikov coordinates,
but they are related to Schwarzschild coordinates by transformations that are algebraically very complicated.
A more physically meaningful set of new coordinates that are also based on radially moving massive particle
geodesics is discussed in Exercise 11.9.

Advanced Eddington-Finkelstein coordinates
Since, in particular, we wish to develop a better description of infalling particles,
let us begin by constructing a new coordinate system based on radially infalling
photons. Recall that the worldline of a radially ingoing photon is given by

$$ct = -r - 2\mu \ln\left|\frac{r}{2\mu} - 1\right| + \text{constant}.$$


The trick is to use the integration constant as the new coordinate, which we
denote by $p$. Thus, we make the coordinate transformation

$$p = ct + r + 2\mu \ln\left|\frac{r}{2\mu} - 1\right|, \tag{11.5}$$

where $p$, for historical reasons, is known as the advanced time parameter and is
clearly a null coordinate (see Section 11.1). Since $p$ is constant along the entire
worldline of the radially ingoing photon, it will be a ‘good’ coordinate wherever
that worldline penetrates.

Differentiating (11.5), we obtain

$$dp = c\, dt + \frac{r}{r - 2\mu}\, dr,$$

and, on substituting for $dt$ in the Schwarzschild line element, we find that in
terms of the parameter $p$ the line element takes the simple form

$$ds^2 = \left(1 - \frac{2\mu}{r}\right) dp^2 - 2\, dp\, dr - r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right). \tag{11.6}$$

We see immediately from (11.6) that $ds^2$ is now regular at $r = 2\mu$; indeed it is
regular for the whole range $0 < r < \infty$, which is the range of $r$-values probed by
an infalling photon geodesic. Thus, in some sense, the transformation (11.5) has
extended the coordinate range of the solution in a way reminiscent of the analytic
continuation of a complex function.
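The equivalence of the two forms of the line element can be checked numerically for arbitrary small displacements. The sketch below treats only the radial part ($d\theta = d\phi = 0$), assumes the differential relation $dp = c\,dt + r\,dr/(r - 2\mu)$ obtained from (11.5), and uses illustrative function names.

```python
def ds2_schwarzschild(r, dt, dr, mu=1.0, c=1.0):
    """Radial part of the Schwarzschild line element (singular at r = 2 mu)."""
    f = 1.0 - 2.0 * mu / r
    return c**2 * f * dt**2 - dr**2 / f


def dp_from_schwarzschild(r, dt, dr, mu=1.0, c=1.0):
    """Differential of the advanced time parameter, dp = c dt + r dr / (r - 2 mu)."""
    return c * dt + r / (r - 2.0 * mu) * dr


def ds2_advanced_ef(r, dp, dr, mu=1.0):
    """Radial part of the line element in (p, r) coordinates (regular at r = 2 mu)."""
    f = 1.0 - 2.0 * mu / r
    return f * dp**2 - 2.0 * dp * dr


for r, dt, dr in ((3.0, 0.2, -0.1), (10.0, 1.0, 0.5), (1.0, 0.3, 0.7)):
    dp = dp_from_schwarzschild(r, dt, dr)
    print(r, ds2_schwarzschild(r, dt, dr), ds2_advanced_ef(r, dp, dr))

# At r = 2 mu the (p, r) form is perfectly finite: ds^2 = -2 dp dr.
print(ds2_advanced_ef(2.0, 0.3, 0.1))
```

The two expressions agree wherever both are defined, while only the $(p, r)$ form can be evaluated at $r = 2\mu$ itself.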

One might object that the coordinate transformation (11.5) cannot be used at
$r = 2\mu$ because it becomes singular. This must happen, however, if one is to
remove the coordinate singularity there. In any case, this transformation takes the
standard form (11.1) for the Schwarzschild line element to the form (11.6). Given
these two solutions, we can simply ask, what is the largest range of coordinates
for which each solution is regular? For the standard form this is $2\mu < r < \infty$,
whereas for the new form (11.6) it is $0 < r < \infty$. In the overlap region $2\mu < r < \infty$
the two solutions are related by (11.5), and hence they must represent the same
solution in this region.

As one might expect, the metric (11.6) is especially convenient for calculating
the paths of null geodesics. In particular, we see that radial null geodesics (for
which $ds = d\theta = d\phi = 0$) are given by

$$\left(1 - \frac{2\mu}{r}\right)\left(\frac{dp}{dr}\right)^2 - 2\,\frac{dp}{dr} = 0,$$

which has the two solutions

$$\frac{dp}{dr} = 0 \quad\Rightarrow\quad p = \text{constant},$$

$$\frac{dp}{dr} = 2\left(1 - \frac{2\mu}{r}\right)^{-1} \quad\Rightarrow\quad p = 2r + 4\mu \ln\left|\frac{r}{2\mu} - 1\right| + \text{constant}, \tag{11.7}$$

which correspond to incoming and outgoing radial null geodesics respectively
(the former being valid by construction).

Since $p$ is a null coordinate, which might be intuitively unfamiliar, it is common
practice to work instead with the related timelike coordinate $t'$, defined by

$$ct' = p - r = ct + 2\mu \ln\left|\frac{r}{2\mu} - 1\right|. \tag{11.8}$$

The line element then takes the form

$$ds^2 = c^2\left(1 - \frac{2\mu}{r}\right) dt'^2 - \frac{4\mu c}{r}\, dt'\, dr - \left(1 + \frac{2\mu}{r}\right) dr^2 - r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right), \tag{11.9}$$

which is again regular for the whole range $0 < r < \infty$. The coordinates $(t', r, \theta, \phi)$
are called advanced Eddington-Finkelstein coordinates. We note that the line
element (11.9) is not invariant with respect to the transformation $t' \to -t'$, under
which the second term on the right-hand side changes sign. From (11.7), we see
that incoming and outgoing photon worldlines are given by

$$ct' = -r + \text{constant}, \tag{11.10}$$

$$ct' = r + 4\mu \ln\left|\frac{r}{2\mu} - 1\right| + \text{constant}. \tag{11.11}$$


The first equation, for ingoing photons, corresponds to a straight line making
an angle of 45° with the $r$-axis and is valid for $0 < r < \infty$. Thus the photon
geodesics are continuous straight lines across $r = 2\mu$. The spacetime diagram
of the Schwarzschild geometry in advanced Eddington-Finkelstein coordinates is
shown in Figure 11.3.

Figure 11.3 Lightcone structure in advanced Eddington-Finkelstein coordinates.

The spacetime diagram now appears more sensible. It is straightforward to see
that the radial trajectory of an infalling particle or photon is continuous at the
Schwarzschild radius $r = 2\mu$. The lightcone structure changes at the Schwarzschild
radius and, as you can see from the diagram, once you have crossed the boundary
$r = 2\mu$ your future is directed towards the singularity. Similarly, it can be seen
that a photon (or particle) starting at $r < 2\mu$ cannot escape to the region $r > 2\mu$.
The Schwarzschild radius $r = 2\mu$ defines an event horizon, a boundary of no
return. Once a particle crosses the event horizon it must fall to the singularity
at $r = 0$. Moreover, from the paths of the ‘outgoing’ null geodesics, we see
that any photons emitted by the infalling particle at $r < 2\mu$ will not reach an
observer in region I. Thus to such an observer the particle appears never to cross
the event horizon. A compact object that has an event horizon is called a black
hole.
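The tilting of the lightcones in the $(r, ct')$-plane can be made quantitative. The sketch below assumes the photon worldlines $ct' = -r + \text{constant}$ and $ct' = r + 4\mu\ln|r/2\mu - 1| + \text{constant}$, from which the ‘outgoing’ family has $dr/d(ct') = (r - 2\mu)/(r + 2\mu)$; the names used are illustrative.

```python
INGOING_DR_DCT = -1.0  # ingoing radial photons: dr/d(ct') = -1 at every radius


def outgoing_dr_dct(r, mu=1.0):
    """dr/d(ct') along the 'outgoing' radial null ray in advanced E-F coordinates."""
    return (r - 2.0 * mu) / (r + 2.0 * mu)


for r in (100.0, 4.0, 2.0, 1.0, 0.1):
    edge = outgoing_dr_dct(r)
    trapped = edge <= 0.0  # both edges of the future lightcone then have dr <= 0
    print(f"r = {r:6.1f} mu: lightcone edges dr/d(ct') = {INGOING_DR_DCT:+.2f}, "
          f"{edge:+.3f}, no escape to larger r: {trapped}")
```

For $r > 2\mu$ the two edges have opposite signs, so motion to larger $r$ is possible; at $r = 2\mu$ the ‘outgoing’ edge is vertical; and for $r < 2\mu$ both edges point towards decreasing $r$.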


Retarded Eddington-Finkelstein coordinates 
One may reasonably ask what occurs if one instead chooses to construct a
new coordinate system based on the worldlines of radially outgoing photons.
By analogy with our discussion above, this is achieved straightforwardly by
introducing the new null coordinate $q$ defined by

$$q = ct - r - 2\mu \ln\left|\frac{r}{2\mu} - 1\right|,$$

which is known as the retarded time parameter. The line element of the
Schwarzschild geometry then becomes

$$ds^2 = \left(1 - \frac{2\mu}{r}\right) dq^2 + 2\, dq\, dr - r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right),$$

which is again regular for $0 < r < \infty$. Similarly, it is common practice to introduce
a new timelike coordinate $t^*$ defined by

$$ct^* = q + r = ct - 2\mu \ln\left|\frac{r}{2\mu} - 1\right|.$$

The coordinates $(t^*, r, \theta, \phi)$ are called retarded Eddington-Finkelstein
coordinates, and the corresponding line element in these coordinates is simply the time
reversal of the advanced Eddington-Finkelstein line element (11.9).

It is straightforward to draw a spacetime diagram analogous to Figure 11.3 in
retarded Eddington-Finkelstein coordinates, and one finds that (by construction)
the outgoing radial null geodesics are continuous straight lines at 45° but the
ingoing null rays are discontinuous, tending to $t^* = +\infty$ at $r = 2\mu$. In this case,
the surface $r = 2\mu$ again acts as a one-way membrane, but this time letting only
outgoing timelike or null geodesics cross from inside to outside. Indeed, particles
must move away from the singularity at $r = 0$ and are forcibly expelled from the
region $r < 2\mu$. Such an object is called a white hole.

This behaviour appears completely at odds with our intuition regarding the
gravitational attraction of a massive body. Moreover, how can the physical
processes that occur be so radically different depending on one’s choice of
coordinates, since we have maintained throughout that coordinates are merely arbitrary
labels of spacetime events? The key to resolving this apparent paradox is to
realise that our original coordinates $(t, r, \theta, \phi)$ covered only a part of the ‘full’
Schwarzschild geometry. This topic is discussed fully in Section 11.9, in which we
introduce Kruskal coordinates, which cover the entire geometry and which show
that it possesses both a black-hole and a white-hole singularity. The advanced
Eddington-Finkelstein coordinates ‘extend’ the solution into the (more familiar)
part of the manifold that constitutes a black hole, whereas the retarded
Eddington-Finkelstein coordinates extend the solution into a different part of the manifold,
corresponding to a white hole. As we will discuss in Section 11.9, the existence
of white holes as a physical reality (as opposed to a mathematical curiosity) is
rather doubtful. Black holes, however, are likely to occur physically, as we now
go on to discuss.


11.6 Gravitational collapse and black-hole formation 

Our investigation of the properties of a black hole would be largely academic 
unless there were reasons for believing that they might exist in Nature. The 
possibility of their existence arises from the idea of gravitational collapse. 

A star is held up by a mixture of gas and radiation pressure, the relative 
contributions depending on its mass. The energy to provide this pressure support is 
derived from the fusion of light nuclei into heavier ones, predominantly hydrogen 
into helium, which releases about 26 MeV for each atom of He that is formed. 
When all the nuclear fuel is used up, however, the star begins to cool and collapse 
under its own gravity. For most stars, the collapse ends in a high-density stellar 
remnant known as a white dwarf. In fact, we expect that in around 5 billion years
the Sun will collapse to form a white dwarf with a radius of about 5000 km and
a spectacularly high mean density of about $10^9$ kg m$^{-3}$.

Astronomers have known about white dwarfs since as long ago as 1915 (the 
earliest example being the companion to the bright star Sirius, known as Sirius 
B), but nobody at the time knew how to explain them. The physical mechanism 
providing the internal pressure to support such a dense object was a mystery. The 
answer had to await the development of quantum mechanics and the formulation 
of Fermi-Dirac statistics. Fowler realised in 1926 that white dwarfs were held 
up by electron degeneracy pressure. The electrons in a white dwarf behave like 
the free electrons in a metal, but the electron states are widely spaced in energy
because of the small size of the star in its white-dwarf form. Because of the 
Pauli exclusion principle, the electrons completely fill these states up to a high 
characteristic Fermi energy. It is these high electron energies that save the star 
from collapse. 

In 1930, Chandrasekhar realized that the more massive a white dwarf, the denser
it must be and so the stronger the gravitational field. For white dwarfs over a
critical mass of about $1.4\,M_\odot$ (now called the Chandrasekhar limit), gravity would
overwhelm the degeneracy pressure and no stable solution would be possible.
Thus, the gravitational collapse of the object must continue. At first it was thought
that the white dwarf must collapse to a point. After the discovery of the neutron,
however, it was realized that at some stage in the collapse the extremely high
densities occurring would cause the electrons to interact with the protons via
inverse $\beta$-decay to form neutrons (and neutrinos, which simply escape). A new
stable configuration - a neutron star - was therefore possible in which the pressure
support is provided by degenerate neutrons. A neutron star of one solar mass
would have a radius of only 30 km, with a density of around $10^{16}$ kg m$^{-3}$. Since
the matter in a neutron star is at nuclear density, the gravitational forces inside the
star are extremely strong. In fact, it is the first point in the evolution of a stellar
object at which general relativistic effects are expected to be important (we will
discuss relativistic stars in Chapter 12).
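The orders of magnitude quoted for the densities of stellar remnants follow from elementary arithmetic. The sketch below assumes a uniform sphere of one solar mass (with a rounded value for $M_\odot$ that is our assumption) and the radii quoted in the text; the function name is illustrative.

```python
import math

M_SUN = 1.989e30  # solar mass in kg (rounded value, assumed here)


def mean_density(mass_kg, radius_m):
    """Mean density of a uniform sphere, in kg m^-3."""
    return mass_kg / (4.0 / 3.0 * math.pi * radius_m**3)


white_dwarf = mean_density(M_SUN, 5.0e6)    # radius of about 5000 km
neutron_star = mean_density(M_SUN, 3.0e4)   # radius of about 30 km

print(f"white dwarf : {white_dwarf:.1e} kg m^-3")
print(f"neutron star: {neutron_star:.1e} kg m^-3")
```

Both results lie within the decades quoted in the text, namely of order $10^9$ and $10^{16}$ kg m$^{-3}$ respectively.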

Given the extreme densities inside a neutron star, there remain uncertainties
in the equation of state of matter. Nevertheless, it is believed that (as for white
dwarfs) there exists a maximum mass above which no stable neutron-star
configuration is possible. This maximum mass is believed to be about $3\,M_\odot$ (which
is known as the Oppenheimer-Volkoff limit). Thus, we believe that stars more
massive than this limit should collapse to form black holes. Moreover, if the
collapse is spherically symmetric then it must produce a Schwarzschild black
hole.

Some theorists were very sceptical about the formation of black holes. The 
Schwarzschild solution in particular is very special - it is exactly spherically 
symmetric by construction. In reality, a star will not be perfectly symmetric 
and so perhaps, as it collapses, the asymmetries will amplify and avoid the 
formation of an event horizon. In the early 1960s, however, Penrose applied global 
geometrical techniques to prove a famous series of ‘singularity theorems’. These 
showed that in realistic situations an event horizon (a closed trapped surface) 
will be formed and that there must exist a singularity within this surface, i.e. a 
point at which the curvature diverges and general relativity ceases to be valid. 
The singularity theorems were important in convincing people that black holes 
must form in Nature. In Appendices 11A and 11B, we discuss some of the 
observational evidence for the existence of black holes. As we will see, there is 
compelling evidence that black holes do indeed exist. Furthermore, as mentioned 
in Section 10.4, it should become possible within the next few years not only to 
measure the masses of black holes but also to measure their angular momenta,
using powerful X-ray telescopes! Direct experimental probes of the strong-gravity 
regime are now possible. 


11.7 Spherically symmetric collapse of dust 

Let us consider the spherically symmetric collapse of a massive star to form a
Schwarzschild black hole, and also the view of this process seen by a stationary
observer at large radius. For simplicity, we consider the case in which the star has
a uniform density and the internal pressure is assumed to be zero. In the absence
of pressure gradients to deflect their motion, the particles on the outer surface of
this ‘ball of dust’ will simply follow radial geodesics. In order to simplify our
analysis still further, we will assume that initially the surface of the ‘star’ is at
rest at infinity.² In this case, the particles on the surface will follow the radial
geodesics we discussed earlier.

² This is equivalent to the collapse commencing with the star’s surface at some finite radius $r = r_0$ with some
finite inwards velocity.

Consider two observers participating in the gravitational collapse of the
spherical star. One observer rides the surface of the star down to $r = 0$, and the other
observer remains fixed at a large radius. Moreover, suppose that the infalling
observer carries a clock and communicates with the distant one by sending out
radial light signals at equal intervals according to this clock. Figure 11.4 shows
the relevant spacetime diagram in advanced Eddington-Finkelstein coordinates
$(ct', r)$, with $\theta$ and $\phi$ suppressed. The dots denote unit intervals of $c\tau/\mu$ and we
have chosen $\tau = t' = 0$ at $r = 8\mu$. This diagram is easily constructed from the
results that were used to obtain Figure 11.2.

Figure 11.4 Collapse of the surface of a pressureless star to form a black hole
in advanced Eddington-Finkelstein coordinates. The star’s surface started at rest
at infinity, and we have chosen $\tau = t' = 0$ at $r = 8\mu$.

For a distant observer at fixed $r$, we know that the standard Schwarzschild
coordinate time $t$ measures proper time. From (11.8), however, we see that if $r$ is
fixed then $dt' = dt$. Thus, a unit interval of $t'$ corresponds to a unit interval of
proper time for a distant fixed observer. From the diagram, we see that the light
pulses are not received at equal intervals of $t'$. Rather, the proper time interval
measured by the distant observer between each pulse steadily increases. Indeed,
the last light pulse to reach this observer is the one emitted just before the surface
of the star crosses $r = 2\mu$. The worldline of this photon is simply the vertical line
$r = 2\mu$, and so this pulse would only ‘reach’ the distant observer at $t = \infty$. Pulses
emitted after the surface of the star has crossed the event horizon do not progress
to larger $r$ but instead progress to smaller $r$ and end up at the singularity at $r = 0$.
Thus, the distant observer never sees the star’s surface cross the radius $r = 2\mu$.
Furthermore, the pulses emitted at equal intervals by the falling observer’s clock
arrive at the distant observer at increasingly longer intervals. Correspondingly, the
photons received by the distant observer are increasingly redshifted, the redshift
tending to infinity as the star’s surface approaches $r = 2\mu$. Both these effects
mean that the distant observer sees the luminosity of the star fall to zero. To
summarise, the distant observer sees the collapse slow down and the star’s state
approach that of a quasi-equilibrium object with radius $r = 2\mu$, which eventually
becomes totally dark. Thus, the distant observer sees the formation of a black hole.

Let us quantify further what the observer sees as the star collapses to form
a black hole. Since we are interested in measurements made by a distant
fixed observer, we may use either advanced Eddington-Finkelstein coordinates
$(t', r, \theta, \phi)$ or traditional Schwarzschild coordinates $(t, r, \theta, \phi)$, as both correspond
to physical quantities at large $r$. We shall use the latter simply because we
have already obtained the equations for a massive radially infalling particle in
Schwarzschild coordinates. Suppose that a particle on the surface of the star emits
a radially outgoing pulse of light at coordinates $(t_E, r_E)$, which is received by the
distant fixed observer at $(t_R, r_R)$. Since the photon follows a radially outgoing
null geodesic, we can write

$$ct_E - r_E - 2\mu \ln\left|\frac{r_E}{2\mu} - 1\right| = ct_R - r_R - 2\mu \ln\left|\frac{r_R}{2\mu} - 1\right|. \tag{11.12}$$

The radial coordinate ‘seen’ by the distant observer at time $t_R$ is the function
$r_E(t_R)$ obtained by solving (11.12). Using the fact that the coordinates $t_E$ and $r_E$ of
the freely falling emitter are related by (11.4), we find that, if $r_E$ is very close to $2\mu$,

$$r_E(t_R) \approx 2\mu + a \exp\left(-\frac{ct_R}{4\mu}\right), \tag{11.13}$$

where $a$ is an unimportant constant depending on $\mu$ and $r_R$. The important
consequence of this result is that the radius $r = 2\mu$ is approached exponentially,
as seen by the distant observer, with a characteristic time $4\mu/c$. Since

$$\frac{\mu}{c} = \frac{GM}{c^3} = 5 \times 10^{-6} \left(\frac{M}{M_\odot}\right)\ \text{seconds},$$

the time scale for stellar-size objects is very small by the usual astrophysical
standards. Thus for any collapse even approximately like the free-fall collapse
described here, the approach to a black hole is extremely rapid.
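To give a feel for these time scales, the sketch below evaluates the characteristic time $4\mu/c = 4GM/c^3$ for a few masses. The physical constants are rounded SI values that we assume here, and the function name is illustrative.

```python
import math

# Rounded SI constants (assumed values, not taken from the text).
G = 6.674e-11      # m^3 kg^-1 s^-2
C = 2.998e8        # m s^-1
M_SUN = 1.989e30   # kg


def horizon_efolding_time(mass_solar):
    """Return the characteristic time 4 mu / c = 4 G M / c^3 in seconds."""
    return 4.0 * G * mass_solar * M_SUN / C**3


for m in (1.0, 10.0, 1.0e6):
    print(f"M = {m:9.1e} M_sun: 4 mu / c = {horizon_efolding_time(m):.2e} s")

# Fraction of the offset r_E - 2 mu still apparent to the distant observer
# one millisecond later, for a collapsing object of 10 solar masses.
remaining = math.exp(-1.0e-3 / horizon_efolding_time(10.0))
print(f"offset remaining after 1 ms (10 M_sun): {remaining:.1e}")
```

For a stellar-mass object the apparent offset from $r = 2\mu$ thus decays by several e-foldings within a millisecond.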

Let us work out the redshift seen by the distant observer as a function of time $t$.
The ratio of the frequencies of a photon at emission and reception is

$$\frac{\nu_R}{\nu_E} = \frac{u_R^\mu\, p_\mu(R)}{u_E^\mu\, p_\mu(E)}, \tag{11.14}$$

where $u_E$ and $u_R$ are the 4-velocities of the emitter and receiver respectively and
$p$ is the photon 4-momentum. The 4-velocity of our emitter riding on the star’s
surface is

$$[u_E^\mu] = \left[(1 - 2\mu/r)^{-1},\ -(2\mu c^2/r)^{1/2},\ 0,\ 0\right],$$

whereas the 4-velocity of the stationary observer at infinity is

$$[u_R^\mu] = [1, 0, 0, 0].$$

Hence (11.14) reduces to

$$\frac{\nu_R}{\nu_E} = \frac{p_0(R)}{u_E^0\, p_0(E) + u_E^1\, p_1(E)} = \left[u_E^0 + u_E^1\, \frac{p_1(E)}{p_0(E)}\right]^{-1},$$

where we have used the fact that the Schwarzschild metric is stationary and so
$p_0$ is conserved along the photon geodesic. Moreover, since $p$ is null we require
$g^{\mu\nu} p_\mu p_\nu = 0$, which in our case reduces to

$$\frac{1}{c^2}\left(1 - \frac{2\mu}{r}\right)^{-1} (p_0)^2 - \left(1 - \frac{2\mu}{r}\right) (p_1)^2 = 0.$$

So, for a radially outgoing photon, $p_1 = -(1 - 2\mu/r)^{-1} p_0/c$ and we find that

$$\frac{\nu_R}{\nu_E} = \left(1 - \frac{2\mu}{r}\right)\left(1 + \sqrt{\frac{2\mu}{r}}\right)^{-1} = 1 - \sqrt{\frac{2\mu}{r}}. \tag{11.15}$$

As $r \to 2\mu$ we see that $\nu_R \to 0$, so the redshift is infinite. By Taylor-expanding
(11.15) about $r = 2\mu$, we find that for $r$ close to $2\mu$ we can write

$$\frac{\nu_R}{\nu_E} \approx \frac{r - 2\mu}{4\mu}.$$

However, near the event horizon the radius of emission is related to the time of
reception by (11.13). Hence

$$\frac{\nu_R}{\nu_E} \propto \exp\left(-\frac{ct_R}{4\mu}\right),$$

so that the redshift goes exponentially to infinity with a characteristic time $4\mu/c$.
The computation of the luminosity is more complicated since it involves non-radial
photon geodesics also. Nevertheless, using the above analysis we see that the rate
at which successive photons arrive will also decrease as $\sim \exp[-ct/(4\mu)]$ and
so we expect the luminosity to decay exponentially as $\sim \exp[-ct/(2\mu)]$.
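The quality of the near-horizon approximation can be checked directly. The sketch below assumes that, for an emitter falling from rest at infinity with 4-velocity $[(1 - 2\mu/r)^{-1}, -(2\mu c^2/r)^{1/2}, 0, 0]$, the frequency ratio is $\nu_R/\nu_E = (1 - 2\mu/r)\left(1 + \sqrt{2\mu/r}\right)^{-1}$, and compares it with the linearised form $(r - 2\mu)/(4\mu)$; the function names are illustrative.

```python
import math


def frequency_ratio(r, mu=1.0):
    """nu_R / nu_E for light emitted radially outwards by the infalling surface at r."""
    return (1.0 - 2.0 * mu / r) / (1.0 + math.sqrt(2.0 * mu / r))


def frequency_ratio_near_horizon(r, mu=1.0):
    """First-order Taylor expansion of the ratio about r = 2 mu."""
    return (r - 2.0 * mu) / (4.0 * mu)


for r in (8.0, 3.0, 2.2, 2.02, 2.002):
    exact = frequency_ratio(r)
    approx = frequency_ratio_near_horizon(r)
    print(f"r = {r:6.3f} mu: exact = {exact:.5f}, linearised = {approx:.5f}")
```

The exact ratio also equals $1 - \sqrt{2\mu/r}$, and the linearised form agrees with it to better than one per cent once $r$ is within about $0.02\mu$ of the horizon.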


11.8 Tidal forces near a black hole 


As discussed in Section 7.14, in Newtonian gravity a distribution of
non-interacting particles freely falling towards the Earth will be elongated in the
direction of motion and compressed in the transverse directions, as a result of
gravitational tidal forces. The same effect occurs in a body falling towards a
spherical object in general relativity, but if the object is a black hole then the
effect becomes infinite at $r = 0$.

We may calculate the tidal forces in the Schwarzschild geometry, working in 
traditional Schwarzschild coordinates (t, r, 6, <p). At any particular point in space, 
the tidal forces have the same form for any (close) pair of particles that are in 
free fall. Thus, it is easiest to calculate the tidal forces at some coordinate radius 
r for the case in which the two particles arc released from rest at r. In this case, a 
frame of orthonormal basis vectors defining the inertial instantaneous rest frame 
of one of the particles may be taken as 


(e Q r=-u» = - 

c c 

(e 2 r=-S 

r 



(«t r = 
(hr = 


i-^W 


-—81 
r sin 6 


Substituting these expressions into (7.28), together with the appropriate expres¬ 
sions for the components of the Riemann tensor in Schwarzschild coordinates, 
from (7.27) we obtain (after some algebra) that the spatial components of the 
orthogonal connecting vector between the two particles satisfy 


d 2 h = 2 ixc 1 - d 2 h = fie 2 g d 2 r = Ac 2 ^ 

dr 2 r 3 ’ dr 2 r 3 ’ dr 2 r 3 


The positive sign in the ('' -equation indicates a tension or stretching in the radial 
direction and the negative signs in the and — equations indicate a pressure or 
compression in the transverse directions. Note the 1/r 3 radial dependence in each 
case, which is characteristic of tidal gravitational forces. Moreover, the equations 
reveal that the tidal forces do not undergo any ‘transition’ at r = 2/i but become 
infinite at r = 0. 
Let us consider an intrepid astronaut falling feet first into a black hole. The
equations derived above will not hold exactly, since there will exist forces between
the particles (atoms) that comprise the astronaut. Nevertheless, when the tidal
gravitational forces become strong we can neglect the interatomic forces, and
the equations derived above will be valid to an excellent approximation. Thus
the unfortunate astronaut would be stretched out like a piece of spaghetti (!), as
illustrated in Figure 11.5. In fact, not only do the tidal forces tear the astronaut
to pieces, but the very atoms of which the astronaut is composed must ultimately
suffer the same fate! Assuming that the limit of tolerance to stretching or compression
of a human body is an acceleration gradient of $\sim 400\ \mathrm{m\,s^{-2}}$ per metre, for
a human to survive the tidal forces at the Schwarzschild radius requires a very
massive black hole with

$$
M > 10^5\, M_\odot .
$$

If you fell towards a supermassive black hole, with say $M \sim 10^9\, M_\odot$ (such black
holes are believed to lie at the centres of some galaxies; see Appendix 11B) you
would cross the event horizon without feeling a thing. However, your fate will
have been sealed - you will end up shredded by the tidal forces of the black
hole as you approach the singularity, from which there is no escape. If you fell
towards a ‘small’ black hole, of mass say $10\, M_\odot$, you would be shredded by the
tidal forces of the hole well before you reached the event horizon.
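The contrast between these cases can be illustrated numerically. The sketch below evaluates the radial tidal gradient $2\mu c^2/r^3 = 2GM/r^3$ from the stretching equation above at the Schwarzschild radius $r = 2GM/c^2$. The function names and the rounded values of the constants are our own choices for illustration, and the calculation is only the crude point-particle estimate used in the text.

```python
# Rounded SI values; assumptions made for this illustration only.
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m s^-1
M_SUN = 1.989e30    # solar mass, kg


def schwarzschild_radius(mass_kg):
    """Coordinate radius of the event horizon, r = 2GM/c^2, in metres."""
    return 2.0 * G * mass_kg / C**2


def radial_tidal_gradient(mass_kg, r_m):
    """Radial stretching acceleration per unit separation, 2GM/r^3, in s^-2."""
    return 2.0 * G * mass_kg / r_m**3


def gradient_at_horizon(mass_solar):
    """Radial tidal gradient evaluated at r = 2GM/c^2 for a mass in solar units."""
    m = mass_solar * M_SUN
    return radial_tidal_gradient(m, schwarzschild_radius(m))


if __name__ == "__main__":
    for mass_solar in (10.0, 1.0e5, 1.0e9):
        g = gradient_at_horizon(mass_solar)
        print(f"M = {mass_solar:.0e} M_sun: gradient at horizon = {g:.2e} s^-2")
```

Because the gradient at the horizon scales as $1/M^2$, the stellar-mass case exceeds the assumed tolerance of 400 m s⁻² per metre by many orders of magnitude, whereas the supermassive case falls far below it.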



Figure 11.5 An astronaut stretched by the tidal forces of a black hole. For a
human to survive this stretching at the Schwarzschild radius requires a very
massive black hole, with $M > 10^5\, M_\odot$.
11.9 Kruskal coordinates

In our discussion of advanced and retarded Eddington-Finkelstein coordinates, we
found that neither coordinate system was completely satisfactory. In the advanced
coordinates the outgoing null rays are discontinuous, and in the retarded
coordinates the ingoing null rays are discontinuous. It is natural to ask whether it is
possible to find a system of coordinates in which both the incoming and outgoing
radial photon geodesics are continuous straight lines. Such a coordinate system
was indeed discovered in 1960 by Martin Kruskal, and it serves also to clarify
the structure of the complete Schwarzschild geometry.

An obvious way to begin is to introduce both the advanced null coordinate $p$
and the retarded null coordinate $q$ that we met during our discussion of
Eddington-Finkelstein coordinates. In the coordinates $(p, q, \theta, \phi)$ the Schwarzschild metric
becomes

$$
ds^2 = \left(1-\frac{2\mu}{r}\right) dp\,dq - r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right), \tag{11.16}
$$

where $r$ is considered as a function of $p$ and $q$, defined implicitly by

$$
\tfrac{1}{2}(p-q) = r + 2\mu \ln\left|\frac{r}{2\mu} - 1\right| .
$$

Our new system of coordinates has some appealing properties. Most importantly,
the 2-space defined by $\theta$ = constant, $\phi$ = constant has the simple metric

$$
ds^2 = \left(1-\frac{2\mu}{r}\right) dp\,dq. \tag{11.17}
$$

Transforming from the null coordinates $p$ and $q$ to the new coordinates

$$
ct = \tfrac{1}{2}(p+q), \tag{11.18}
$$

$$
\tilde r = \tfrac{1}{2}(p-q) = r + 2\mu \ln\left|\frac{r}{2\mu} - 1\right|, \tag{11.19}
$$

where $t$ is the standard Schwarzschild timelike coordinate and $\tilde r$ is a radial spacelike
coordinate (sometimes called the tortoise coordinate!), the 2-space metric
then becomes

$$
ds^2 = \left(1-\frac{2\mu}{r}\right)\left(c^2\,dt^2 - d\tilde r^2\right) \tag{11.20}
$$

$$
\phantom{ds^2} = \Omega^2(x)\,\eta_{\mu\nu}\,dx^\mu dx^\nu, \tag{11.21}
$$


where $x^0 = ct$ and $x^1 = \tilde r$. This line element has the same form as that of a
Minkowski 2-space (which is spatially flat) but it is multiplied by what mathematicians
call a conformal scaling factor, $\Omega^2(x)$, which is a function of position. The
2-space itself is curved, because the derivatives of the function $\Omega(x)$ enter into the
components of the curvature tensor, but the line element (11.21) of the 2-space
is manifestly conformally flat. In fact, any two-dimensional (pseudo-)Riemannian
manifold is conformally flat (see Appendix 11C), in that a coordinate system
always exists in which the line element takes the form (11.21). We have thus
succeeded in finding such a coordinate system for the 2-space (11.17).

The form of the line element (11.21) has an important consequence for studying
the paths of radially moving photons (for which $d\theta = d\phi = 0$). Since the conformal
factor $\Omega^2(x)$ is just a scaling, it does not change the lightcone structure and so
the latter should just look like that in Minkowski space. Thus, in a spacetime
diagram in $(ct, \tilde r)$ coordinates, both ingoing and outgoing radial null geodesics
are straight lines with slope $\pm 1$, as is easily seen by setting $ds^2 = 0$ in (11.20).

Unfortunately, however, the coordinates $(ct, \tilde r)$ are pathological when $r = 2\mu$,
as is easily seen from (11.19). This suggests that, instead of using the parameters
$p$ and $q$ directly, we should look for a coordinate transformation that preserves
the manifest conformal nature of the 2-space defined by (11.17) but removes the
offending factor $1 - 2\mu/r$, which is the cause of the pathological behaviour. It
is straightforward to see that a transformation of the form $\tilde p(p)$ and $\tilde q(q)$ will
achieve this goal, since, in this case, the metric becomes

$$
ds^2 = \left(1-\frac{2\mu}{r}\right)\frac{dp}{d\tilde p}\,\frac{dq}{d\tilde q}\; d\tilde p\, d\tilde q,
$$

which has the same general form as (11.17). An appropriate choice of the functions
$\tilde p(p)$ and $\tilde q(q)$ that removes the factor $(1 - 2\mu/r)$ in the line element is (as
suggested by Kruskal)

$$
\tilde p = \exp\left(\frac{p}{4\mu}\right), \qquad \tilde q = -\exp\left(-\frac{q}{4\mu}\right),
$$

for which we find that

$$
ds^2 = \frac{32\mu^3}{r}\exp\left(-\frac{r}{2\mu}\right) d\tilde p\, d\tilde q.
$$
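As a check on this step, the prefactor can be worked out explicitly. The following is a sketch for the region $r > 2\mu$, writing $\tilde p = \exp(p/4\mu)$ and $\tilde q = -\exp(-q/4\mu)$ for the rescaled null coordinates and using $\tfrac{1}{2}(p-q) = \tilde r = r + 2\mu\ln|r/2\mu - 1|$; the grouping of the intermediate terms is our own.

```latex
\begin{align*}
\frac{dp}{d\tilde p}\,\frac{dq}{d\tilde q}
  &= \frac{4\mu}{\tilde p}\left(-\frac{4\mu}{\tilde q}\right)
   = 16\mu^2 \exp\!\left(-\frac{p-q}{4\mu}\right)
   = 16\mu^2 \exp\!\left(-\frac{\tilde r}{2\mu}\right) \\
  &= 16\mu^2 \left(\frac{r}{2\mu}-1\right)^{-1} \exp\!\left(-\frac{r}{2\mu}\right), \\[4pt]
\left(1-\frac{2\mu}{r}\right)\frac{dp}{d\tilde p}\,\frac{dq}{d\tilde q}
  &= \frac{r-2\mu}{r}\cdot\frac{32\mu^3}{r-2\mu}\,\exp\!\left(-\frac{r}{2\mu}\right)
   = \frac{32\mu^3}{r}\exp\!\left(-\frac{r}{2\mu}\right).
\end{align*}
```

The factor $(r - 2\mu)$ cancels, which is why the resulting line element is regular at $r = 2\mu$.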

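The usual form of the metric is then obtained by defining a timelike variable $v$
and a spacelike variable $u$ by

$$
v = \tfrac{1}{2}(\tilde p + \tilde q), \qquad u = \tfrac{1}{2}(\tilde p - \tilde q).
$$

Thus, the full line element for the Schwarzschild geometry in Kruskal coordinates
$(v, u, \theta, \phi)$ is given by

$$
ds^2 = \frac{32\mu^3}{r}\exp\left(-\frac{r}{2\mu}\right)\left(dv^2 - du^2\right) - r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right), \tag{11.22}
$$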
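where $r$ is considered as a function of $v$ and $u$ that is defined implicitly by

$$
u^2 - v^2 = \left(\frac{r}{2\mu} - 1\right)\exp\left(\frac{r}{2\mu}\right). \tag{11.23}
$$

It is straightforward to show that the coordinates $v$ and $u$ are related to the
original Schwarzschild coordinates $t$ and $r$ by the following transformations. For
$r > 2\mu$ we have

$$
v = \left(\frac{r}{2\mu} - 1\right)^{1/2} \exp\left(\frac{r}{4\mu}\right) \sinh\left(\frac{ct}{4\mu}\right),
\qquad
u = \left(\frac{r}{2\mu} - 1\right)^{1/2} \exp\left(\frac{r}{4\mu}\right) \cosh\left(\frac{ct}{4\mu}\right),
$$

whereas, for $r < 2\mu$,

$$
v = \left(1 - \frac{r}{2\mu}\right)^{1/2} \exp\left(\frac{r}{4\mu}\right) \cosh\left(\frac{ct}{4\mu}\right),
\qquad
u = \left(1 - \frac{r}{2\mu}\right)^{1/2} \exp\left(\frac{r}{4\mu}\right) \sinh\left(\frac{ct}{4\mu}\right).
$$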
Considerable insight into the nature of the Schwarzschild geometry can be
obtained by plotting its spacetime diagram in Kruskal coordinates. The causal
structure defined by radial light rays is (by construction) particularly easy to
analyse in Kruskal coordinates. From the metric (11.22), we see that for
$ds = d\theta = d\phi = 0$ we have

$$
v = \pm u + \text{constant},
$$

which represents straight lines at ±45° to the axes. This is a direct consequence of
the fact that the 2-space with $d\theta = d\phi = 0$ is manifestly conformally flat in $(v, u)$
coordinates. Thus, the lightcone structure should look like that in Minkowski
space. Also, we note that a massive particle worldline must always lie within the
future light-cone at each point.

It is also instructive to plot lines of constant $t$ and $r$. From (11.23) we see that
lines of constant $r$ are curves of constant $u^2 - v^2$ and are hence hyperbolae. In
particular, the value $r = 2\mu$ corresponds to either of the straight lines $v = \pm u$,
which are the asymptotes to the set of constant-$r$ hyperbolae, and the value $r = 0$
corresponds to the hyperbolae $v = \pm\sqrt{u^2 + 1}$. Thus the ‘point’ in space $r = 0$
is mapped into two lines. However, not too much can be made of this since it
is a singularity of the geometry. We should not glibly speak of it as a part of
spacetime with a well-defined dimensionality. Similarly, lines of constant $t$ may
be mapped out. It is straightforward to show that

$$
\tanh\left(\frac{ct}{4\mu}\right) =
\begin{cases}
v/u & \text{for } r > 2\mu, \\
u/v & \text{for } r < 2\mu,
\end{cases}
$$

so fixed values of $t$ correspond to lines of constant $u/v$, i.e. straight lines through
the origin. The value $t = -\infty$ corresponds to $u = -v$, while $t = \infty$ corresponds
to $u = v$. The value $t = 0$ for $r > 2\mu$ corresponds to the line $v = 0$, whereas for
$r < 2\mu$ it is the line $u = 0$.
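The relations between Schwarzschild and Kruskal coordinates can be checked numerically. The sketch below assumes the standard transformation, with $\sinh$ and $\cosh$ of $ct/4\mu$ exchanging roles across $r = 2\mu$, and verifies both the implicit relation (11.23) and the $\tanh$ relation for constant-$t$ lines. The function name and the choice of units with $\mu = 1$ are our own.

```python
import math


def kruskal_from_schwarzschild(ct, r, mu=1.0):
    """Return Kruskal (v, u) for Schwarzschild (ct, r) in regions I and II.

    Assumes r > 0 and r != 2*mu; the Schwarzschild coordinates are
    pathological exactly on the horizon, so that case is rejected.
    """
    if r <= 0.0 or r == 2.0 * mu:
        raise ValueError("need r > 0 and r != 2*mu")
    amplitude = math.sqrt(abs(r / (2.0 * mu) - 1.0)) * math.exp(r / (4.0 * mu))
    sinh_t = math.sinh(ct / (4.0 * mu))
    cosh_t = math.cosh(ct / (4.0 * mu))
    if r > 2.0 * mu:
        return amplitude * sinh_t, amplitude * cosh_t   # region I
    return amplitude * cosh_t, amplitude * sinh_t       # region II


def implicit_rhs(r, mu=1.0):
    """Right-hand side of (11.23): (r/2mu - 1) exp(r/2mu)."""
    return (r / (2.0 * mu) - 1.0) * math.exp(r / (2.0 * mu))


if __name__ == "__main__":
    for ct, r in [(0.5, 3.0), (-1.0, 5.0), (0.7, 1.0), (2.0, 0.5)]:
        v, u = kruskal_from_schwarzschild(ct, r)
        print(f"ct={ct:+.1f}, r={r:.1f}: u^2 - v^2 = {u*u - v*v:+.6f}, "
              f"(r/2mu - 1) e^(r/2mu) = {implicit_rhs(r):+.6f}")
```

For every sample point the two printed numbers should agree to rounding error, and points with $r < 2\mu$ give $u^2 - v^2 < 0$, placing them in region II above the lines $v = \pm u$.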

We note that the entire region covered by the Schwarzschild coordinates
$-\infty < t < \infty$, $0 < r < \infty$ is mapped onto the regions I and II in Figure 11.6. Thus, we
would require two Schwarzschild coordinate patches (I, II) and (I', II') to cover the
entire Schwarzschild geometry, but a single Kruskal coordinate system suffices.
The diagonal lines $r = 2\mu$, $t = \infty$ and $r = 2\mu$, $t = -\infty$ define event horizons
separating the regions of spacetime II and II' from the other regions, I and I'.

The Kruskal diagram has some curious features. There are two ‘Minkowski’
regions, I and I', so apparently there are two universes. We can identify region I
as the spacetime region outside a Schwarzschild black hole and region II as the
interior of the black-hole event horizon. Any particle that travels from region I to
region II can never return and, moreover, must eventually reach the singularity
$r = 0$. Regions I' and II' are completely inaccessible from regions I or II. Region
II' is similar to region II but in reverse: it is a part of spacetime from which
a particle can escape (into regions I and I') but not enter. Moreover, there is
a singularity in the past - a white hole - from which particles can emanate.
Indeed, we may now understand more clearly our discussion of the advanced
and retarded Eddington-Finkelstein coordinates in Section 11.5: the advanced
coordinates describe the Schwarzschild geometry in regions I and II, whereas the
retarded coordinates cover the regions I' and II'. The two universes I and I' are
actually connected by a wormhole at the origin, which we discuss in more detail
in the next section, but, as we will show, no particle can travel between regions
I and I'.

Figure 11.6 Spacetime diagram of the Schwarzschild geometry in Kruskal
coordinates. The lower and upper wavy lines at the boundaries of the shaded regions
are respectively the past singularity and the future singularity at $r = 0$. The
broken-line arrows show escaping signals.

It is worth asking what has happened here. How can a few simple coordinate 
transformations lead to what is apparently new physics? What we have done 
amounts to mathematically extending the Schwarzschild solution. Mathematicians 
would call this a maximal extension of the Schwarzschild solution because all 
geodesics either extend to infinite values of their affine parameter or end at a 
past or future singularity. Thus Kruskal coordinates probe all the Schwarzschild 
geometry. Hence, we find that the complete Schwarzschild geometry consists of 
a black hole and white hole and two universes connected at their horizons by a 
wormhole. 

The extended Schwarzschild metric is a solution of Einstein’s theory and hence
is allowed by classical general relativity. Thus, for example, classical general
relativity allows the existence of white holes. Photons or particles could, in
principle, emanate from a past singularity. But, as you can see from the Kruskal
spacetime diagram, you cannot ‘fall into’ a white hole since it can only exist in
your past. Can a white hole really exist? The answer is that we don’t know for
sure. Classical GR must break down at singularities. We would expect quantum
effects to become important at ultra-short distances and ultra-high energies. In
fact, from the three fundamental constants $G$, $\hbar$ and $c$ we can form the following
energy, mass, time, length and density scales:

$$
\begin{aligned}
\text{Planck energy} \quad & E_{\rm P} = (\hbar c^5/G)^{1/2} = 1.22 \times 10^{19}\ \text{GeV}, \\
\text{Planck mass} \quad & m_{\rm P} = (\hbar c/G)^{1/2} = 2.18 \times 10^{-5}\ \text{g}, \\
\text{Planck time} \quad & t_{\rm P} = (\hbar G/c^5)^{1/2} = 5.39 \times 10^{-44}\ \text{s}, \\
\text{Planck length} \quad & l_{\rm P} = (\hbar G/c^3)^{1/2} = 1.62 \times 10^{-33}\ \text{cm}, \\
\text{Planck density} \quad & \rho_{\rm P} = c^5/(\hbar G^2) = 5.16 \times 10^{93}\ \text{g cm}^{-3}.
\end{aligned}
$$

These Planck scales define the characteristic energies, lengths, times, etc. at which
we expect quantum gravitational effects to become important. To put it into some
kind of perspective, an elementary particle with the Planck mass would weigh
about the same as a small bacterium.
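The quoted values of the Planck scales are easy to reproduce. The following sketch evaluates the five combinations of $G$, $\hbar$ and $c$ in SI units and converts them to the units used in the list above; the rounded constant values and the function name are assumptions made for illustration.

```python
import math

# Rounded SI values; assumptions made for this illustration only.
HBAR = 1.0546e-34        # reduced Planck constant, J s
G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8              # speed of light, m s^-1
JOULES_PER_GEV = 1.602e-10


def planck_scales():
    """Return the Planck energy, mass, time, length and density in the
    units quoted in the text (GeV, g, s, cm, g cm^-3)."""
    energy_j = math.sqrt(HBAR * C**5 / G)
    mass_kg = math.sqrt(HBAR * C / G)
    time_s = math.sqrt(HBAR * G / C**5)
    length_m = math.sqrt(HBAR * G / C**3)
    density_si = C**5 / (HBAR * G**2)          # kg m^-3
    return {
        "energy_GeV": energy_j / JOULES_PER_GEV,
        "mass_g": mass_kg * 1.0e3,
        "time_s": time_s,
        "length_cm": length_m * 1.0e2,
        "density_g_cm3": density_si * 1.0e-3,  # 1 kg m^-3 = 1e-3 g cm^-3
    }


if __name__ == "__main__":
    for name, value in planck_scales().items():
        print(f"{name:>14s} = {value:.3e}")
```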

Nobody really expects the centres of black holes to harbour true singularities. 
Instead, it is expected that, close to the classical singularity, quantum gravitational 
effects will occur that will prevent the divergences of classical general relativity. 
We do not yet have a complete theory of quantum gravity, though many people 
hope that M-theory (formerly known as superstring theory) might one day provide 
such a theory. Theorists have developed semi-classical theories, however, which 
might (or might not) contain some of the features of a complete theory of quantum 
gravity. Such calculations suggest that white holes would be unstable and could 
not exist for more than about a Planck time. It is interesting that within a few 
pages we have pushed Einstein’s theory of gravity to the edge of known physics. 


11.10 Wormholes and the Einstein-Rosen bridge 

Although it is not obvious from Figure 11.6, the two universes I and I' are actually
connected by a wormhole at the origin. To understand the structure at the origin,
you must realize that the coordinates $\theta$ and $\phi$ have been suppressed in this figure;
each point in Figure 11.6 actually represents a 2-sphere.

We can gain some intuitive insight into wormholes by considering the geometry
of the spacelike hypersurface $v = 0$, which extends from $u = +\infty$ to $u = -\infty$.
The line element for this hypersurface is

$$
ds^2 = -\frac{32\mu^3}{r}\exp\left(-\frac{r}{2\mu}\right) du^2 - r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right).
$$

We can draw a cross-section of this hypersurface corresponding to the equatorial
plane $\theta = \pi/2$, in which the line element reduces further to

$$
ds^2 = -\frac{32\mu^3}{r}\exp\left(-\frac{r}{2\mu}\right) du^2 - r^2\, d\phi^2. \tag{11.24}
$$



To interpret this, we may consider a two-dimensional surface possessing a
line element $d\sigma^2$ given by minus (11.24) and embed it in a three-dimensional
Euclidean space.

This embedding is most easily performed by re-expressing $d\sigma^2$ in terms of the
coordinates $r$ and $\phi$, which is easily shown to yield the familiar form

$$
d\sigma^2 = \left(1-\frac{2\mu}{r}\right)^{-1} dr^2 + r^2\, d\phi^2. \tag{11.25}
$$
However, we must remember that, in the spacelike hypersurface $v = 0$, as we
move along the $u$-axis from $+\infty$ to $-\infty$ the value of $r$ decreases to a minimum
value $r = 2\mu$ (at $u = 0$) and then increases again. In general, in Euclidean space, a
2-surface parameterised by arbitrary coordinates $(\xi, \eta)$ can be specified by giving
three functions $x^a(\xi, \eta)$ $(a = 1, 2, 3)$, where the $x^a$ define some coordinate system
in the three-dimensional Euclidean space. In our particular case, it will be useful
to use cylindrical polar coordinates $(\rho, \psi, z)$, in which case the line element of
the three-dimensional space is

$$
d\sigma^2 = d\rho^2 + \rho^2\, d\psi^2 + dz^2. \tag{11.26}
$$

Moreover, since the 2-surface we wish to embed (which is parameterised by the
coordinates $r$ and $\phi$) is clearly axisymmetric, we may take the three functions
specifying this surface to have the form

$$
\rho = \rho(r), \qquad \psi = \phi, \qquad z = z(r).
$$


Substituting these forms into (11.26), we may thus write the line element on the
embedded 2-surface as

$$
d\sigma^2 = \left[\left(\frac{d\rho}{dr}\right)^2 + \left(\frac{dz}{dr}\right)^2\right] dr^2 + \rho^2\, d\phi^2. \tag{11.27}
$$

For the geometry of the embedded 2-surface to be identical to the geometry of
the 2-space of interest, we require the line elements (11.25) and (11.27) to be
identical, and so we require $\rho(r) = r$ and thus

$$
1 + \left(\frac{dz}{dr}\right)^2 = \left(1-\frac{2\mu}{r}\right)^{-1}.
$$

The solution to this differential equation is easily found to be

$$
z(r) = \sqrt{8\mu(r - 2\mu)} + \text{constant},
$$


and substituting $r = \rho$ gives us the equation of the cross-section of the embedded
2-surface in the $(\rho, z)$-plane of the Euclidean 3-space. Taking the constant of
integration to be zero, and remembering that $r$ (and hence $\rho$) is never less than
$2\mu$, we find that the surface has the form shown in Figure 11.7. Thus, the
geometry of the spacelike hypersurface at $v = 0$ can be thought of as two distinct,
but identical, asymptotically flat Schwarzschild manifolds joined at the ‘throat’
$r = 2\mu$ by an Einstein-Rosen bridge. If one so wishes, one can also connect the
two asymptotically flat regions together in a region distant from the throat. In this
case, the wormhole connects two distant regions of a single universe.
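The embedding profile can be verified directly. The sketch below takes $z(r) = \sqrt{8\mu(r - 2\mu)}$, with the integration constant set to zero, and checks by finite differences that $1 + (dz/dr)^2$ reproduces $(1 - 2\mu/r)^{-1}$. The function names, the step size and the units with $\mu = 1$ are illustrative assumptions.

```python
import math


def z_embed(r, mu=1.0):
    """Upper half of the embedded surface, z(r) = sqrt(8 mu (r - 2 mu))."""
    return math.sqrt(8.0 * mu * (r - 2.0 * mu))


def embedding_check(r, mu=1.0, h=1.0e-6):
    """Return (1 + (dz/dr)^2, (1 - 2mu/r)^-1), using a central difference."""
    dz_dr = (z_embed(r + h, mu) - z_embed(r - h, mu)) / (2.0 * h)
    return 1.0 + dz_dr**2, 1.0 / (1.0 - 2.0 * mu / r)


if __name__ == "__main__":
    for r in (2.5, 3.0, 5.0, 10.0, 100.0):
        lhs, rhs = embedding_check(r)
        print(f"r = {r:6.1f}: 1 + z'^2 = {lhs:.6f}, (1 - 2mu/r)^-1 = {rhs:.6f}")
```

The lower half of the surface, $z \to -z$, corresponds to the second asymptotically flat region, and the two halves meet at the throat $r = 2\mu$, where $z = 0$ and $dz/dr$ diverges.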

In either case, the structure of the wormhole is dynamic. One is used to thinking
of the Schwarzschild geometry as ‘static’. However, working for the moment in
terms of the traditional Schwarzschild coordinates, it is only in regions I and I' that
$t$ is timelike and the fact that the metric coefficients are independent of $t$ means
that spacetime is static. In regions II and II', the $t$-coordinate is spacelike and
the $r$-coordinate is timelike. Since the metric coefficients do depend explicitly on
$r$, the spacetime in these regions is no longer static but evolves with respect to
this timelike coordinate. Returning to Kruskal coordinates, consider the spacelike
hypersurface $v = 0$. As this surface is pushed forwards in time (in the $+v$ direction
in the Kruskal diagram), part of it enters region II and begins to evolve.

[Figure 11.7: embedded surface, with curves labelled $\phi$ = constant and $u$ = constant.]

As $v$ increases, the picture of the geometry of the hypersurface is qualitatively
the same as that illustrated in Figure 11.7, but the bridge narrows, the universes
now joining at $r < 2\mu$. At $v = 1$ the bridge pinches off completely and the two
universes simply touch at the singularity $r = 0$. For larger values of $v$ the two
universes, each containing a singularity at $r = 0$, are completely separate. Since
the Kruskal solution is symmetric in $v$, the same things happen for negative values
of $v$. The full time evolution is shown schematically in Figure 11.8. Thus, the
two universes are disconnected at the beginning, each containing a singularity of
infinite curvature ($r = 0$). As they evolve in time, their singularities join each
other and form a non-singular bridge. The bridge enlarges until at $v = 0$ it reaches
a maximum radius at the throat equal to $r = 2\mu$. It then contracts and pinches off,
leaving the two universes disconnected and containing singularities ($r = 0$) once
again.

Figure 11.8 Time evolution of the Einstein-Rosen bridge. The panels show, in
order, $v < -1$, $v = -1$, $-1 < v < 0$, $v = 0$, $0 < v < 1$, $v = 1$ and $v > 1$.

Sadly, it is impossible for a traveller to pass through the wormhole from one 
universe into the other, since the formation, expansion and collapse of the bridge 
occur too rapidly. By examining the paths of light rays in the Kruskal diagram, we 
can deduce that no particle or photon can pass across the bridge from the faraway 
region of one universe to the faraway region of the other without getting caught 
and crushed in the throat as it pinches off. Nevertheless, after falling through 
the horizon of the black hole, a traveller could see light signals from the other 
universe through the throat of the wormhole. Unfortunately, the penalty for seeing 
the other universe is death at the singularity. 

Can wormholes exist in Nature? Can they connect different universes, or different
parts of the same universe? Again, nobody knows for sure. Many theorists
would argue that we need to understand quantum gravity to understand wormholes.
Wormholes are probably unstable, but ‘virtual’ wormholes are a feature of
some formulations of quantum gravity.


11.11 The Hawking effect 

So far our discussion of black holes has been purely classical. Indeed, we have
found that classically nothing can escape from within the event horizon
of a black hole; that is why they are called black holes! However, in 1974,
Stephen Hawking applied the principles of quantum mechanics to electromagnetic
fields near a black hole and found the amazing result that black holes radiate
continuously as a blackbody with a temperature inversely proportional to their
mass! Hawking’s original calculation uses the techniques of quantum field theory,
but we can derive the main results very simply from elementary arguments.

According to quantum theory, even the vacuum of empty space exhibits quantum
fluctuations, in which particle-antiparticle pairs are created at one event
only to annihilate one another at some other event. Pair creation violates the
conservation of energy and so is classically forbidden. In quantum mechanics,
however, one form of Heisenberg’s uncertainty principle is $\Delta t\,\Delta E = \hbar$, where $\Delta E$
is the minimum uncertainty in the energy of a particle that resides in a quantum
mechanical state for a time $\Delta t$. Thus, provided the pair annihilates in a time less
than $\Delta t = \hbar/\Delta E$, where $\Delta E$ is the amount of energy violation, no physical law
has been broken.

Let us now consider such a process occurring just outside the event horizon of a
black hole. For simplicity, let us consider a Schwarzschild black hole in $(t, r, \theta, \phi)$
coordinates. Suppose that a particle-antiparticle pair is produced from the vacuum
and that the constituents of the pair have 4-momenta $p$ and $\bar p$ respectively. Since
the spacetime is stationary ($\partial_0 g_{\mu\nu} = 0$), the quantities $p_0 = e_0 \cdot p$ and $\bar p_0 = e_0 \cdot \bar p$ are
conserved along the particle worldlines; here $e_0$ is the $t$-coordinate basis vector.
Thus, for a fluctuation from the vacuum, classical conservation requires

$$
e_0 \cdot p + e_0 \cdot \bar p = 0. \tag{11.28}
$$

The squared ‘length’ of the coordinate basis vector $e_0$ is given by

$$
e_0 \cdot e_0 = g_{00} = c^2\left(1 - \frac{2\mu}{r}\right). \tag{11.29}
$$

Thus, outside the horizon ($r > 2\mu$), $e_0$ is timelike. The components $e_0 \cdot p$ and $e_0 \cdot \bar p$
are therefore proportional to the particle energies as measured by an observer
whose 4-velocity is along the $e_0$-direction. Hence both must be positive, so the
conservation condition (11.28) cannot be satisfied.

However, if the fluctuation occurs near the event horizon then the inward-moving
particle may travel to the region $r < 2\mu$. Inside the event horizon $e_0$
is spacelike, as shown by (11.29). Thus $e_0 \cdot \bar p$ is a component of the spatial
momentum for some observer and so may be negative. Hence the conservation
condition (11.28) can be satisfied if the antiparticle (say) crosses the horizon with
negative $e_0 \cdot \bar p$ and the particle escapes to infinity with positive $e_0 \cdot p$. As seen by
an observer at infinity, the black hole has emitted a particle of energy $e_0 \cdot p$ and
the black hole’s mass has decreased by $(e_0 \cdot p)/c^2$ as a consequence of the particle
falling into it. This is the Hawking effect. Of course, the argument is equally valid
if it is the particle that falls into the black hole and the antiparticle that escapes
to infinity. The black hole emits particles and antiparticles in equal numbers.

For a fluctuation near the horizon, the inward-travelling particle needs to endure
in a prohibited negative $e_0 \cdot p$ condition only for a short proper time, as measured by
some locally free-falling observer, before reaching the inside of the horizon where
negative $e_0 \cdot p$ is allowed. The particle has, in fact, tunnelled quantum mechanically
through a region outside the horizon, where negative $e_0 \cdot p$ is classically forbidden,
to a region inside the event horizon where it is classically allowed. The process
works best where the proper time in the forbidden region is smallest, i.e. close to
the horizon.

The distant observer sees a steady flux of particles and antiparticles. The flux
must be steady, since the geometry is independent of $t$ and so the rate of particle
emission must also be independent of $t$. Let us calculate the typical energy of
such a particle as measured by the distant observer. Suppose that the
particle-antiparticle pair is created at some event P with coordinate radius $R = 2\mu + \epsilon$. Let
us consider this event as viewed by a freely falling observer, starting from rest at
this point. Since the observer is in free fall, the rules of special relativity apply in
his frame. A typical measure of the proper time $\Delta\tau$ elapsed before the observer
reaches the horizon may be obtained by considering a radially free-falling particle
that starts from rest at $r = R$. In this case,

$$
\dot t = \frac{(1 - 2\mu/R)^{1/2}}{1 - 2\mu/r},
\qquad
\dot r = -\left(\frac{2\mu c^2}{r} - \frac{2\mu c^2}{R}\right)^{1/2},
$$

where a dot denotes differentiation with respect to the proper time $\tau$.
Thus the required proper time interval is

$$
\Delta\tau = -\int_{2\mu+\epsilon}^{2\mu} \left(\frac{2\mu c^2}{r} - \frac{2\mu c^2}{2\mu + \epsilon}\right)^{-1/2} dr
\approx \frac{2(2\mu\epsilon)^{1/2}}{c},
$$

where the final result is quoted to leading order in $\epsilon$. From the uncertainty principle,
the typical energy $\mathcal{E}$ of the particle, as measured by a freely falling observer, is
given by

$$
\mathcal{E} = \frac{\hbar}{\Delta\tau} = \frac{\hbar c}{2(2\mu\epsilon)^{1/2}}.
$$

However, this can also be written as

$$
\mathcal{E} = p \cdot u \approx p_0 u^0,
$$

where $u$ is the observer’s 4-velocity and the approximation holds since $u^1 \ll u^0$.
Now, $u^0 = \dot t \approx (2\mu/\epsilon)^{1/2}$ to leading order in $\epsilon$. Moreover, $p_0$ is conserved along
the particle’s worldline and is equal to the energy $E$ of the particle as measured
by the distant observer, whose 4-velocity is simply $[u^\mu] = (1, 0, 0, 0)$. Thus, we
finally obtain

$$
E = \frac{\hbar c}{2(2\mu\epsilon)^{1/2}} \left(\frac{\epsilon}{2\mu}\right)^{1/2} = \frac{\hbar c^3}{4GM}. \tag{11.30}
$$

Remarkably, this result does not depend on $\epsilon$: the particle always emerges with
this characteristic energy.
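The small-$\epsilon$ approximations in this estimate can be tested numerically. The sketch below works in units with $\hbar = c = 1$ and $\mu = 1$ (our choice), evaluates the exact proper-time integral after the substitution $r = R - s^2$ (which removes the integrable singularity at $r = R$), and then forms the distant-observer energy $(\hbar/\Delta\tau)/u^0$ with $u^0 = (1 - 2\mu/R)^{-1/2}$. Function names and the midpoint-rule resolution are illustrative assumptions.

```python
import math


def proper_time_to_horizon(eps, mu=1.0, c=1.0, n=20000):
    """Proper time to fall from rest at R = 2mu + eps to r = 2mu.

    With r = R - s^2 the integral of dr / |dr/dtau| becomes
    (2/c) * int_0^sqrt(eps) sqrt(R (R - s^2) / (2 mu)) ds, a smooth
    integrand that is evaluated here with the midpoint rule.
    """
    big_r = 2.0 * mu + eps
    ds = math.sqrt(eps) / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        total += math.sqrt(big_r * (big_r - s * s) / (2.0 * mu))
    return 2.0 * total * ds / c


def proper_time_leading_order(eps, mu=1.0, c=1.0):
    """Leading-order result quoted in the text, 2 sqrt(2 mu eps) / c."""
    return 2.0 * math.sqrt(2.0 * mu * eps) / c


def distant_energy(eps, mu=1.0, hbar=1.0, c=1.0):
    """Energy at infinity: (hbar / delta_tau) divided by u^0 at release."""
    big_r = 2.0 * mu + eps
    u0 = (1.0 - 2.0 * mu / big_r) ** -0.5
    return hbar / proper_time_to_horizon(eps, mu, c) / u0


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-6):
        print(f"eps = {eps:.0e}: dtau = {proper_time_to_horizon(eps):.6e} "
              f"(leading order {proper_time_leading_order(eps):.6e}), "
              f"E = {distant_energy(eps):.6f}")
```

In these units (11.30) reads $E = 1/(4\mu) = 0.25$, and the computed energy should approach this value as $\epsilon \to 0$ regardless of where the pair is created.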

The full quantum field theory calculation shows that the particles are in
fact received with a blackbody energy spectrum characterised by the Hawking
temperature

$$
T = \frac{\hbar c^3}{8\pi k_{\rm B} G M}.
$$

The typical particle energy is thus $E = k_{\rm B} T = \hbar c^3/(8\pi G M)$, which is only a
factor $2\pi$ smaller than our crude estimate (11.30). Putting in numbers, we find that

$$
T = 6 \times 10^{-8} \left(\frac{M}{M_\odot}\right)^{-1}\ \text{K}.
$$

Thus the radiation from a solar-mass black hole, such as might be formed by the
gravitational collapse of a massive star, is negligibly small.
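The numerical coefficient in the Hawking temperature can be reproduced with a few lines. The sketch below evaluates $T = \hbar c^3/(8\pi k_{\rm B} G M)$ in SI units; the rounded constants and the function name are assumptions made for illustration.

```python
import math

# Rounded SI values; assumptions made for this illustration only.
HBAR = 1.0546e-34     # reduced Planck constant, J s
C = 2.998e8           # speed of light, m s^-1
G = 6.674e-11         # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.3807e-23      # Boltzmann constant, J K^-1
M_SUN = 1.989e30      # solar mass, kg


def hawking_temperature(mass_kg):
    """Hawking temperature T = hbar c^3 / (8 pi k_B G M), in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * K_B * G * mass_kg)


if __name__ == "__main__":
    for mass_solar in (1.0, 10.0, 1.0e6):
        t = hawking_temperature(mass_solar * M_SUN)
        print(f"M = {mass_solar:.0e} M_sun: T = {t:.2e} K")
```

The inverse scaling with mass means that any astrophysical black hole is far colder than the cosmic microwave background, so in practice it absorbs more radiation than it emits.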

It is straightforward to calculate the rate $dM/dt$ at which the black hole loses
mass, as determined by a stationary distant observer whose proper time is $t$. Since
the black-hole event horizon emits radiation as a blackbody of temperature $T$, the
black-hole mass must decrease at a rate

$$
\frac{dM}{dt} = -\frac{\sigma T^4 A}{c^2},
$$

where $\sigma = \pi^2 k_{\rm B}^4/(60\hbar^3 c^2)$ is the Stefan-Boltzmann constant and $A$ is the proper
area of the event horizon. From the Schwarzschild metric we find that $A = 16\pi\mu^2$,
and so we obtain

$$
\frac{dM}{dt} = -\frac{\alpha\hbar}{M^2}, \tag{11.31}
$$

where the constant $\alpha = c^4/(15\,360\,\pi G^2) = 3.76 \times 10^{49}\ \mathrm{kg^2\,m^{-2}}$ in SI units. The
solution $M(t)$ to (11.31) is easily calculated. For a black hole whose evaporation is
complete at time $t_0$, we find that

$$
M(t) = \left[3\alpha\hbar(t_0 - t)\right]^{1/3}. \tag{11.32}
$$

This result shows that a burst of energy is emitted right at the end of a black
hole’s life. For example, in the final second it should emit $\sim 10^{22}$ J of energy,
primarily as $\gamma$-rays. No such events have yet been identified.
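The figures quoted for the evaporation can be checked as follows. The sketch evaluates $\alpha = c^4/(15\,360\,\pi G^2)$, the mass remaining a time $t_0 - t$ before the end from (11.32), and the corresponding rest energy $Mc^2$ released in the final second. The constants are rounded SI values and the function names are our own.

```python
import math

# Rounded SI values; assumptions made for this illustration only.
HBAR = 1.0546e-34     # reduced Planck constant, J s
C = 2.998e8           # speed of light, m s^-1
G = 6.674e-11         # gravitational constant, m^3 kg^-1 s^-2

ALPHA = C**4 / (15360.0 * math.pi * G**2)   # kg^2 m^-2


def mass_before_end(time_left_s):
    """Mass (kg) remaining a time t0 - t before evaporation ends, from (11.32)."""
    return (3.0 * ALPHA * HBAR * time_left_s) ** (1.0 / 3.0)


def energy_emitted_in_final(time_left_s):
    """Total energy (J) radiated in the last `time_left_s` seconds, M c^2."""
    return mass_before_end(time_left_s) * C**2


if __name__ == "__main__":
    print(f"alpha = {ALPHA:.3e} kg^2 m^-2")
    print(f"mass one second before the end = {mass_before_end(1.0):.3e} kg")
    print(f"energy emitted in the final second = {energy_emitted_in_final(1.0):.3e} J")
```

Since $M \propto (t_0 - t)^{1/3}$, the energy released in the last $\Delta t$ seconds grows only as $\Delta t^{1/3}$, which is why most of the final output is concentrated in a short burst.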


Appendix 11A: Compact binary systems 

One of the best ways of finding candidate black holes is to search for luminous 
compact X-ray sources. The reason is that if a black hole has a stellar companion 
then the intense tidal field can pull gas from the companion, producing an accretion 
disc around the black hole. A schematic picture is shown in Figure 11.9. As we 
showed in Chapter 10, accretion discs can radiate very efficiently and we would 
expect to observe high-energy (X-ray) photons emitted from a small region of 
space. 

Table 11.1 summarizes the two common classes of compact binaries. The 
compact object can be a white dwarf, neutron star or black hole. If you find a 
compact binary system then you can set limits on the mass of the compact object 
from the dynamics of the binary orbit. If you find evidence for a compact object 
that is more massive than the Chandrasekhar limit then you have good evidence 
that the object might be a black hole. 
Table 11.1 Compact accreting binary systems

                                         Compact object
Companion star        White dwarf             Neutron star              Black hole
--------------------  ----------------------  ------------------------  ----------
Early type, massive   None known              Massive X-ray binaries    Cyg X-1
Late type, low mass   Cataclysmic variables   Low mass X-ray binaries   A0620-00
                      (e.g., dwarf novae)


Figure 11.9 Schematic picture of a compact binary system. 


In fact it is not so straightforward. What observers actually measure is the mass
function

$$
f(M) = \frac{P K^3}{2\pi G},
$$


where $P$ is the orbital period, and $K$ is the radial velocity amplitude. For example,
for the low-mass X-ray binary A0620-00 the period is $P = 7.7$ hours and
$K = 457\ \mathrm{km\,s^{-1}}$. From Kepler’s laws we can show that the mass function is related
to the masses $M_1$ and $M_2$ of the compact object and the companion star and the
inclination angle $i$ of the orbit to the plane of the sky by

$$
f(M) = \frac{M_1^3 \sin^3 i}{(M_1 + M_2)^2}.
$$

You can see from this equation that the mass function is a strict lower limit on the
mass of the compact object. It is equal to the latter, $f(M) = M_1$, only if $M_2 = 0$
and the orbit is viewed edge on (so that $\sin i = 1$). For example, for A0620-00
the lower limit on the mass of the compact object is $2.9\,M_\odot$, and this makes it
a very good black hole candidate because this mass limit is very close to the
theoretical upper limit for the mass of a neutron star. In fact, it is possible to
make reasonable estimates³ for $M_2$ and $\sin i$ in this system, leading to a probable
mass of $\sim 10\,M_\odot$ for the compact object - well into the black-hole regime.

Table 11.2 Derived parameters and dynamical mass measurements of SXTs

Source      f(M) (M⊙)     ρ (g cm⁻³)   q (= M₁/M₂)   i (deg)    M₁ (M⊙)     M₂ (M⊙)
----------  ------------  -----------  ------------  ---------  ----------  --------
V404 Cyg    6.08 ± 0.06   0.005        17 ± 1        55 ± 4     12 ± 2      0.6
G2000+25    5.01 ± 0.12   1.6          24 ± 10       56 ± 15    10 ± 4      0.5
N Oph 77    4.86 ± 0.13   0.7          > 19          60 ± 10    6 ± 2       0.3
N Mus 91    3.01 ± 0.15   1.0          8 ± 2         54 +20     6 +5/−2     0.8
A0620-00    2.91 ± 0.08   1.8          15 ± 1        37 ± 5     10 ± 5      0.6
J0422+32    1.21 ± 0.06   4.2          > 12          20-40      10 ± 5      0.3
J1655-40    3.24 ± 0.14   0.03         3.6 ± 0.9     67 ± 3     6.9 ± 1     2.1
4U1543-47   0.22 ± 0.02   0.2          —             20-40      5.0 ± 2.5   2.5
Cen X-4     0.21 ± 0.08   0.5          5 ± 1         43 ± 11    1.3 ± 0.6   0.4
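Returning to the mass function, both forms of $f(M)$ are simple to evaluate. The sketch below computes $f(M) = PK^3/(2\pi G)$ from the rounded values of $P$ and $K$ quoted for A0620-00, and separately evaluates $M_1^3\sin^3 i/(M_1+M_2)^2$ to illustrate that it never exceeds $M_1$. The function names and rounded constants are assumptions for illustration; because $f$ depends on $K^3$, the rounded inputs reproduce the published value only to within roughly ten per cent.

```python
import math

# Rounded SI values; assumptions made for this illustration only.
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30    # solar mass, kg


def mass_function_observed(period_s, k_m_per_s):
    """Mass function f(M) = P K^3 / (2 pi G), in kg, from the observables."""
    return period_s * k_m_per_s**3 / (2.0 * math.pi * G)


def mass_function_from_masses(m1, m2, inclination_rad):
    """f(M) = M1^3 sin^3(i) / (M1 + M2)^2, in the same units as m1 and m2."""
    return m1**3 * math.sin(inclination_rad) ** 3 / (m1 + m2) ** 2


# A0620-00 with the rounded values quoted in the text.
f_a0620_solar = mass_function_observed(7.7 * 3600.0, 457.0e3) / M_SUN

if __name__ == "__main__":
    print(f"A0620-00: f(M) is approximately {f_a0620_solar:.1f} M_sun")
    for m2, incl_deg in [(0.0, 90.0), (0.6, 90.0), (0.6, 37.0)]:
        f = mass_function_from_masses(10.0, m2, math.radians(incl_deg))
        print(f"M1 = 10, M2 = {m2}, i = {incl_deg} deg: f(M) = {f:.2f} M_sun")
```

Any non-zero companion mass or inclination away from edge-on lowers $f(M)$ below $M_1$, which is why a measured mass function above the neutron-star limit is such strong evidence for a black hole.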

Table 11.2 summarises the dynamical mass limits on some good black-hole
candidates (so-called soft X-ray transients, or SXTs). As you can see, in several systems,
such as V404 Cyg, G2000+25 and N Oph 77, the minimum mass inferred from
the mass function is well above the theoretical maximum mass limit for a neutron
star. As we understand things at present there can be no other explanation than
that the compact objects are black holes.


Appendix 11B: Supermassive black holes 

The first quasar⁴ (3C273) was discovered in 1963 by Maarten Schmidt. He
measured a cosmological redshift of $z = 0.15$ for this object, which was
unprecedentedly high at the time (quasars have since been discovered with redshifts
as high as $z = 5.8$). Quasars are very luminous, typically 100-1000 times brighter
than a large galaxy. However, they are compact, so compact, in fact, that quasars
look like stars in photographs. In fact, from variability and other studies one can
infer that the size of the continuum-emitting region of a quasar is of order a few
parsecs or less. How can we explain such a phenomenon? Imagine an object
radiating many times the luminosity of an entire galaxy from a region smaller
than the Solar System. Donald Lynden-Bell was one of the first to suggest that
the quasar phenomenon is caused by accretion of gas onto a supermassive black
hole residing at the centre of a galaxy. The black-hole masses required to explain
the high luminosities of quasars are truly spectacular - we require black holes
with masses a few million to a few billion times the mass of the Sun.

³ An estimate of the mass $M_2$ can be made by measuring the spectral type and luminosity of the companion
star. The inclination angle can be estimated from the shape of the star’s light curve by searching for evidence
of eclipsing by the compact object.

⁴ Quasi-stellar radio source. We now know that the majority of quasars are radio quiet, and so they are often
called QSOs for quasi-stellar object.

Do such supermassive black holes exist? The evidence in recent years has 
become extremely strong. Using the Hubble Space Telescope it is possible to 
probe the velocity dispersions of stars in the central regions of galaxies. According 
to Newtonian dynamics, we would expect the characteristic velocities to vary as

$$
v^2 \sim \frac{GM}{r}.
$$

If the central mass is dominated by a supermassive black hole then we expect the
typical velocities of stars to increase as we go closer to the centre. This is indeed
what is found in a number of galaxies. From the rate of increase of the velocities
with radius, we can estimate the mass of the central object, which seems to be
correlated with the mass of the bulge component of the galaxy:

$$
M_{\rm bh} \sim 0.006\, M_{\rm bulge}.
$$

It seems as though, at the time of galaxy formation, about half a percent of the 
mass of the bulge material collapses right to the very centre of a galaxy to form a 
supermassive black hole. During this phase the infalling gas radiates efficiently, 
producing a quasar. When the gas supply is used up, the quasar quickly fades 
away leaving a dormant massive black hole that is starved of fuel. Nobody has 
yet developed a convincing theory of how this happens, or of what determines 
the mass of the central black hole. 

A sceptic might argue that these observations merely prove that a dense compact 
object, not necessarily a black hole, exists at the centre of a galaxy. But there 
are two beautiful observational results that probe compact objects on parsecond 
scales, making it almost certain that the central objects are black holes. In our 
own Milky Way Galaxy it is possible to measure the proper motions of stars in 
the Galactic centre (using infrared wavelengths to penetrate through the dense 
dust that obscures optical light). This has allowed astronomers to see the stars 
actually moving and so infer their three-dimensional motions. These observations 
imply that there exists a black hole of mass $2.5 \times 10^6\,M_\odot$ at the centre of our 
Galaxy. 

In a remarkable set of observations, a disc of H$_2$O masers has been detected in 
the galaxy NGC 4258 using very long baseline interferometry (VLBI). The VLBI 
observations measure the velocities of the masing clouds on scales of ~ 0.3-2 
parseconds and are well fitted by a thin (actually slightly warped) disc in circular 
motion (see Figure 11.10). The mass of the central black hole is estimated to be 
$4 \times 10^7\,M_\odot$. 
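To order of magnitude, a mass estimate of this kind follows from the Newtonian relation M ~ v^2 r / G for material on a circular orbit. The short Python sketch below illustrates the arithmetic only. The orbital speed and radius (about 1000 km/s at about 0.15 pc) are illustrative assumptions rather than values quoted in the text, and the function name `enclosed_mass` is hypothetical.

```python
# Keplerian estimate M ~ v^2 r / G for gas on a circular orbit.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # solar mass, kg
PARSEC = 3.086e16    # parsec, m


def enclosed_mass(v, r):
    """Mass (kg) enclosed within radius r (m) for circular speed v (m/s)."""
    return v**2 * r / G


# Illustrative inputs (assumptions, not measurements quoted in the text).
v_maser = 1.0e6             # 1000 km/s
r_maser = 0.15 * PARSEC     # roughly 0.5 light years

mass = enclosed_mass(v_maser, r_maser)
print(f"M ~ {mass / M_SUN:.1e} solar masses")
```

With these assumed inputs the estimate comes out at a few times 10^7 solar masses, the same order as the value quoted for NGC 4258.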

Table 11.3 lists the masses of some potential supermassive black holes, with 
a five-star rating. The masing disc of NGC 4258 gets a full five stars: this is 
the strongest observational evidence for a supermassive black hole. The stellar 
motions in the Galactic centre get four stars, though some astronomers might 
argue that this evidence is so strong that it should rate five stars. Most of the other 
observations are based on measurements of stellar-velocity dispersion. This is 
fairly strong evidence but not completely convincing 5 and so rates only two stars. 

Figure 11.10 The masing H$_2$O disc in the centre of NGC 4258. The lower 
left-hand panel shows the variation in the line-of-sight velocity in km s$^{-1}$ of 
the material in the disc as a function of the distance along its major axis in 
milliarcseconds. In the upper panel and lower right-hand panel the distance scales 
are given in light years. 

Table 11.3 Potential supermassive black holes 

Rating   Source                 M_bh / M_sun    Evidence
***      M87                    2 × 10^9        stars and optical disc
**       NGC 3115               1 × 10^9        stars
**       NGC 4594 (Sombrero)    5 × 10^8        stars
**       NGC 3377               1 × 10^9        stars
*****    NGC 4258               4 × 10^7        masing H2O disc
**       M31 (Andromeda)        3 × 10^7        stars
**       M32                    3 × 10^6        stars
****     Galactic centre        2.5 × 10^6      stars and 3D motions


Appendix 11C: Conformal flatness of two-dimensional Riemannian 
manifolds 

Consider a general two-dimensional (pseudo-)Riemannian manifold in which the 
points are labelled with some arbitrary coordinate system $x^a$ ($a = 1, 2$). For any 
such manifold to be conformally flat, we require that we can always find a 
coordinate system $x'^a$ in which the metric takes the form 

$$g'_{ab}(x') = \Omega^2(x')\,\eta_{ab}, \tag{11.33}$$

where $\Omega(x')$ is an arbitrary function of the new coordinates and $[\eta_{ab}] = 
\mathrm{diag}(\pm 1, \pm 1)$, the signs depending on the signature of the metric. 

Suppose the primed coordinates are given by the transformation 

$$x'^1 = \alpha(x^1, x^2) \quad\text{and}\quad x'^2 = \beta(x^1, x^2).$$

In order that (11.33) is satisfied, we thus require 

$$g'^{12} = (\partial_a\alpha)(\partial_b\beta)\,g^{ab} = 0, \tag{11.34}$$

$$g'^{11} \mp g'^{22} = \left[(\partial_a\alpha)(\partial_b\alpha) \mp (\partial_a\beta)(\partial_b\beta)\right]g^{ab} = 0, \tag{11.35}$$

where in the second equation the minus sign corresponds to the case where the 
metric is positive- or negative-definite, and the plus sign corresponds to the case 
where the metric is indefinite. 


5 The interpretation of velocity dispersion measurements requires some assumptions about the degree of velocity 
anisotropy. 

It is straightforward to verify that (11.34) is satisfied identically if 

$$\partial_a\alpha = \kappa\,\epsilon_{ab}\,g^{bc}\,\partial_c\beta,$$

where $\kappa$ is an arbitrary function of the coordinates and $\epsilon_{ab}$ is the alternating 
symbol, for which $\epsilon_{11} = \epsilon_{22} = 0$ and $\epsilon_{12} = -\epsilon_{21} = 1$. Moreover, substituting this 
expression into (11.35), we find that 

$$\left(\frac{\kappa^2}{g} \mp 1\right)\left[(\partial_a\beta)(\partial_b\beta)\,g^{ab}\right] = 0,$$

where $g = \det[g_{ab}]$. For a positive- or negative-definite metric the factor in square 
brackets cannot be zero and, moreover, we can guarantee that $g > 0$. Thus, in this 
case, we can satisfy our requirements by choosing $\kappa^2 = g$. For an indefinite 
metric, however, we must require that the factor in square brackets is non-zero, 
i.e. $\beta$ must not be a null coordinate. In this case, we can again guarantee that 
$g < 0$, and so we choose $\kappa^2 = -g$. Thus we have shown explicitly that any 
two-dimensional (pseudo-)Riemannian manifold is conformally flat. 
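The two algebraic identities used in this argument can be spot-checked numerically at a single point of the manifold. The Python sketch below is an illustration under stated assumptions, not part of the proof: it draws a random symmetric 2 x 2 metric, a random gradient for beta and a random kappa, builds the gradient of alpha from the choice made above, and evaluates the two residuals. The function name `residuals` and the sampling ranges are hypothetical.

```python
import random


def residuals(rng):
    """Return (cross, gap) for one random 2x2 metric, or None if it is nearly singular.

    cross = g^{ab} (d_a alpha)(d_b beta), which should vanish (11.34);
    gap   = g^{ab} (d_a alpha)(d_b alpha) - (kappa^2 / g) g^{ab} (d_a beta)(d_b beta).
    """
    g11, g12, g22 = (rng.uniform(-2.0, 2.0) for _ in range(3))
    det = g11 * g22 - g12 * g12              # g = det[g_ab], either sign allowed
    if abs(det) < 0.1:
        return None
    inv = [[g22 / det, -g12 / det], [-g12 / det, g11 / det]]   # g^{ab}
    eps = [[0.0, 1.0], [-1.0, 0.0]]                            # alternating symbol
    dbeta = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]   # d_c beta
    kappa = rng.uniform(0.5, 2.0)

    # d_a alpha = kappa * eps_ab * g^{bc} * d_c beta
    raised = [sum(inv[b][c] * dbeta[c] for c in range(2)) for b in range(2)]
    dalpha = [kappa * sum(eps[a][b] * raised[b] for b in range(2)) for a in range(2)]

    def contract(u, w):
        return sum(inv[a][b] * u[a] * w[b] for a in range(2) for b in range(2))

    cross = contract(dalpha, dbeta)
    gap = contract(dalpha, dalpha) - (kappa**2 / det) * contract(dbeta, dbeta)
    return cross, gap


rng = random.Random(0)
samples = [s for s in (residuals(rng) for _ in range(1000)) if s is not None]
worst = max(max(abs(c), abs(d)) for c, d in samples)
print("largest residual:", worst)
```

Both residuals should vanish to rounding error for either sign of the determinant, which is the pointwise content of the statement that (11.34) holds identically and that (11.35) reduces to a condition on kappa alone.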


Exercises 

11.1 In the Schwarzschild geometry, we introduce the new coordinates 

$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta.$$

Find the form of the line element in these coordinates. 

11.2 By introducing the new coordinate $\rho$ defined by 

$$r = \rho\left(1 + \frac{\mu}{2\rho}\right)^2,$$

show that the line element for the Schwarzschild geometry can be written in the 
isotropic form 

$$ds^2 = c^2\left(1 - \frac{\mu}{2\rho}\right)^2\left(1 + \frac{\mu}{2\rho}\right)^{-2}dt^2 - \left(1 + \frac{\mu}{2\rho}\right)^4\left(d\rho^2 + \rho^2\,d\theta^2 + \rho^2\sin^2\theta\,d\phi^2\right).$$

Show that $g_{00} \to 1 - 2\mu/\rho$ in the weak-field limit $\mu \ll \rho$. 

11.3 Show that the worldlines of radially moving photons in the Schwarzschild geometry 
are given by 

$$ct = r + 2\mu\ln\left|\frac{r}{2\mu} - 1\right| + \text{constant} \qquad \text{(outgoing photon)},$$

$$ct = -r - 2\mu\ln\left|\frac{r}{2\mu} - 1\right| + \text{constant} \qquad \text{(incoming photon)}.$$

11.4 Show that, on introduction of the advanced Eddington-Finkelstein timelike 
coordinate $t'$, defined by $ct' = ct + 2\mu\ln|r/(2\mu) - 1|$, the Schwarzschild line element 
takes the form 

$$ds^2 = c^2\left(1 - \frac{2\mu}{r}\right)dt'^2 - \frac{4\mu c}{r}\,dt'\,dr - \left(1 + \frac{2\mu}{r}\right)dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$

Hence show that the worldlines of radially moving photons in advanced Eddington- 
Finkelstein coordinates are given by 

$$ct' = r + 4\mu\ln\left|\frac{r}{2\mu} - 1\right| + \text{constant} \qquad \text{(outgoing photon)},$$

$$ct' = -r + \text{constant} \qquad \text{(incoming photon)}.$$

11.5 Show that, on introduction of the retarded Eddington-Finkelstein timelike coordinate 
$t^*$, defined by $ct^* = ct - 2\mu\ln|r/(2\mu) - 1|$, the Schwarzschild line element takes the form 

$$ds^2 = c^2\left(1 - \frac{2\mu}{r}\right)dt^{*2} + \frac{4\mu c}{r}\,dt^*\,dr - \left(1 + \frac{2\mu}{r}\right)dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$

Hence find the equations for the worldlines of radially moving photons in retarded 
Eddington-Finkelstein coordinates. Use this result to sketch the spacetime diagram 
showing the light-cone structure in this coordinate system. 

11.6 A particle in the Schwarzschild geometry emits a radially outgoing photon at 
coordinates $(t_E, r_E)$, which is received by the distant fixed observer at $(t_R, r_R)$. 
Show that, if $r_E$ lies just outside the horizon $r = 2\mu$, the radial coordinate 'seen' by 
the distant observer at the time $t_R$ is given by 

r E (t R )K2 f i + 2pe- e <- , *-'*V*. 

11.7 An observer sits on the surface of a star as it collapses to form a black hole. Once 
an event horizon forms would the observer see any light from the star? 

11.8 A spherical distribution of dust of coordinate radius R and total mass M collapses 
from rest under its own gravity. Show that, as the collapse progresses, the coordinate 
radius r of the star's surface and the elapsed proper time $\tau$ of an observer sitting 
on the surface are related by 

$$\tau(r) = -\frac{1}{(2GM)^{1/2}}\int_R^r\left(\frac{r'}{1 - r'/R}\right)^{1/2}dr'.$$

By making the substitution $r = R\cos^2(\psi/2)$, or otherwise, show that the solution 
can be expressed parametrically as 

$$r = \frac{R}{2}(1 + \cos\psi), \qquad \tau = \frac{R}{2}\left(\frac{R}{2GM}\right)^{1/2}(\psi + \sin\psi).$$

Calculate the proper time experienced by the observer before the star collapses to 
a point. 
11.9 A massive particle is released from rest at infinity in the Schwarzschild geometry. 
Show that the covariant components of its subsequent 4-velocity at coordinate 
radius r can be written as $u_\mu = c^2\,\partial_\mu T$, where 



Hence show that the line element of the Schwarzschild geometry in $(T, r, \theta, \phi)$ 
coordinates is given by 

$$ds^2 = c^2\,dT^2 - \left(dr + \sqrt{\frac{2\mu c^2}{r}}\,dT\right)^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$

Is this new form singular at $r = 2\mu$? What can you say about the hypersurface 
T = constant? Show that observers infalling radially from rest at infinity have 
$\dot{T} = 1$ and hence give a physical interpretation of the T coordinate. 

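11.10 A massive particle is released from rest at coordinate radius r in the Schwarzschild 
geometry. Show that a frame of orthonormal basis vectors defining the inertial 
instantaneous rest frame of the particle may be taken as 

$$(\hat{e}_0)^\mu = \frac{1}{c}u^\mu = \frac{1}{c}\left(1 - \frac{2\mu}{r}\right)^{-1/2}\delta^\mu_0, \qquad (\hat{e}_1)^\mu = \left(1 - \frac{2\mu}{r}\right)^{1/2}\delta^\mu_1,$$

$$(\hat{e}_2)^\mu = \frac{1}{r}\,\delta^\mu_2, \qquad (\hat{e}_3)^\mu = \frac{1}{r\sin\theta}\,\delta^\mu_3.$$

Hence show that the spatial components of the orthogonal connecting vector 
between two such nearby particles satisfy 

$$\frac{d^2\zeta^{\hat{r}}}{d\tau^2} = +\frac{2\mu c^2}{r^3}\,\zeta^{\hat{r}}, \qquad \frac{d^2\zeta^{\hat{\theta}}}{d\tau^2} = -\frac{\mu c^2}{r^3}\,\zeta^{\hat{\theta}}, \qquad \frac{d^2\zeta^{\hat{\phi}}}{d\tau^2} = -\frac{\mu c^2}{r^3}\,\zeta^{\hat{\phi}}.$$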


11.11 Two compact masses, each of mass m, are connected by a light strong wire of 
length l. The system is aligned in such a way that the two masses lie along a radial 
line from a Schwarzschild black hole, and it is released from rest at coordinate 
radius r. Obtain an expression for the tension in the wire immediately after the 
system is released. 

11.12 An astronaut, starting from rest at infinity, falls radially inwards towards a 
Schwarzschild black hole with $M = 10^5\,M_\odot$. Calculate the radial coordinate from 
the centre of the black hole at which the astronaut first experiences a lateral tidal 
force of 400 m s$^{-2}$ m$^{-1}$ and is therefore crushed. How does this radial coordinate 
compare with the position of the event horizon? 

11.13 An unpowered satellite is in radial free fall towards a Schwarzschild black hole. 
Show that the principal stresses in the satellite are given by 

$$\frac{2\mu c^2}{r^3}, \qquad \frac{\mu c^2}{r^3}, \qquad \frac{\mu c^2}{r^3}.$$

In what directions do these principal stresses act? Compare your answer with that 
obtained in Exercise 11.10. 

11.14 An unpowered satellite follows a circular orbit of radius r around a Schwarzschild 
black hole. Show that the principal stresses in the satellite are given by 

$$\frac{\mu c^2}{r^3}\,\frac{2r - 3\mu}{r - 3\mu}, \qquad \frac{\mu c^2}{r^3}\,\frac{r}{r - 3\mu}, \qquad \frac{\mu c^2}{r^3}.$$

In what directions do these principal stresses act? 

11.15 Suppose that p and q are respectively the advanced and retarded Eddington- 
Finkelstein time parameters, defined in terms of Schwarzschild coordinates by 

$$p = ct + r + 2\mu\ln\left|\frac{r}{2\mu} - 1\right|,$$

$$q = ct - r - 2\mu\ln\left|\frac{r}{2\mu} - 1\right|.$$

A new set of (Kruskal) coordinates is defined by 

$$v = \tfrac{1}{2}\left(e^{p/4\mu} - e^{-q/4\mu}\right) \quad\text{and}\quad u = \tfrac{1}{2}\left(e^{p/4\mu} + e^{-q/4\mu}\right).$$

Show that these new coordinates are related to Schwarzschild coordinates for 
$r > 2\mu$ by 



and for $r < 2\mu$ by 



11.16 Show that, in terms of the Kruskal coordinates u and v defined in Exercise 11.15, 
the Schwarzschild line element takes the form 

$$ds^2 = \frac{32\mu^3}{r}\exp\left(-\frac{r}{2\mu}\right)\left(dv^2 - du^2\right) - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right),$$

where r is considered as a function of v and u and is defined implicitly by 

$$u^2 - v^2 = \left(\frac{r}{2\mu} - 1\right)\exp\left(\frac{r}{2\mu}\right).$$

Show further that v is timelike and u is spacelike throughout the Schwarzschild 
geometry. 

11.17 Perform an embedding into three-dimensional Euclidean space of the 2-space with 
line element 

$$d\sigma^2 = dr^2 + (r^2 + a^2)\,d\phi^2,$$

and hence show that the resulting 2-surface has a geometry reminiscent of a 
wormhole. 

11.18 By examining the paths of light rays in the Kruskal diagram, deduce that no 
particle can pass through the Einstein-Rosen wormhole from region I to region I' 
or vice versa, before the throat of the wormhole pinches off. 

11.19 A Schwarzschild black hole of mass M radiates as a blackbody of temperature 
$T = \hbar c^3/(8\pi k_B G M)$. Show from first principles that the black hole has a lifetime 
$\tau = M^3/(3\alpha\hbar)$, where $\alpha = c^4/(15360\pi G^2)$. In its last second, calculate the total 
energy radiated and estimate the typical energy of each radiated particle. 

11.20 From observations of a compact binary system, one may calculate the mass 
function 

$$f(M) = \frac{PK^3}{2\pi G},$$

where P is the orbital period and K is the radial velocity amplitude. From Kepler's 
laws in Newtonian gravity, show that $f(M)$ is related to the masses $M_1$ and $M_2$ of 
the compact object and the companion star and the inclination angle i of the orbit 
to the plane of the sky by 

$$f(M) = \frac{M_1^3\sin^3 i}{(M_1 + M_2)^2}.$$

12 

Further spherically symmetric geometries 


In the preceding three chapters, we have considered in some detail the 
Schwarzschild geometry, which represents the gravitational field outside a 
static spherically symmetric object. We also considered the structure of the 
Schwarzschild black hole, in which the empty-space field equations are satisfied 
everywhere except at the central intrinsic singularity. In this chapter, we consider 
solving the Einstein equations for a static spherically symmetric spacetime in 
regions where the presence of other fields means that the energy-momentum 
tensor is non-zero. In particular, we will concentrate on two physically interesting 
situations. First, we discuss the relativistic gravitational equations for the 
interior of a spherically symmetric matter distribution (or star); in this case the 
energy-momentum tensor of the matter making up the star must be included in 
the Einstein field equations. Second, we consider the spacetime geometry outside 
a static spherically symmetric charged object; once again this is not a vacuum, 
since it is filled with a static electric field whose energy-momentum must be 
included in the field equations. 


12.1 The form of the metric for a stellar interior 

Most stars in the sky are nowhere near dense enough for general-relativistic effects 
to be important in determining their structure. This is true for main sequence stars 
(of which our Sun is an example), red giants and even such high-density objects 
as white dwarfs. Thus, most stars will never even evolve into an object that is not 
adequately described by the Newtonian theory of stellar structure. 1 For neutron 
stars, however, the extremely high densities involved (see Section 11.6) mean that 
the internal gravitational forces will be very strong, and so we expect general- 
relativistic effects to play a significant role in determining their structure and their 
stability to collapse. As a result, it is of practical (as well as theoretical) interest 
to consider the relativistic equations governing the equilibrium of a centrally 
symmetric self-gravitating distribution of matter. 


1 See, for example, S. Chandrasekhar, An Introduction to the Study of Stellar Structure, Dover, 1958. 

Since we are assuming spherical symmetry and a static matter distribution, the 
appropriate general form of the metric is that derived in Section 9.1, namely 

$$ds^2 = A(r)\,dt^2 - B(r)\,dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{12.1}$$

As in our derivation of the Schwarzschild metric, the two functions A(r) and B(r) are 
determined by solving the Einstein equations. For our present discussion, however, 
we shall not solve the empty-space field equations $R_{\mu\nu} = 0$, which are valid outside 
the spherical object, but instead solve the full field equations that hold in the 
interior of the object. These are most conveniently written in the form (8.15), namely 

$$R_{\mu\nu} = -\kappa\left(T_{\mu\nu} - \tfrac{1}{2}T g_{\mu\nu}\right), \tag{12.2}$$

where $T_{\mu\nu}$ is the energy-momentum tensor of the matter of which the object is 
composed, $T \equiv T^\mu_{\ \mu}$ and $\kappa = 8\pi G/c^4$. For the discussion in this chapter we will 
assume the matter to be described by a perfect fluid, so that 

$$T_{\mu\nu} = \left(\rho + \frac{p}{c^2}\right)u_\mu u_\nu - p\,g_{\mu\nu}, \tag{12.3}$$

where $\rho(r)$ is the proper mass density and $p(r)$ is the isotropic pressure in the 
instantaneous rest frame of the fluid, both of which may be taken as functions 
only of the radial coordinate r for a static matter distribution. Using the fact that 
$u_\mu u^\mu = c^2$, we find that 

$$T = \left(\rho + \frac{p}{c^2}\right)c^2 - p\,\delta^\mu_\mu = \rho c^2 - 3p,$$

and so the field equations (12.2) read 

$$R_{\mu\nu} = -\kappa\left[\left(\rho + \frac{p}{c^2}\right)u_\mu u_\nu - \tfrac{1}{2}\left(\rho c^2 - p\right)g_{\mu\nu}\right]. \tag{12.4}$$

As shown in Section 9.2, the off-diagonal components of the Ricci tensor $R_{\mu\nu}$ 
for the metric (12.1) are all zero and the diagonal components are given by 

$$R_{00} = -\frac{A''}{2B} + \frac{A'}{4B}\left(\frac{A'}{A} + \frac{B'}{B}\right) - \frac{A'}{rB}, \tag{12.5}$$

$$R_{11} = \frac{A''}{2A} - \frac{A'}{4A}\left(\frac{A'}{A} + \frac{B'}{B}\right) - \frac{B'}{rB}, \tag{12.6}$$

$$R_{22} = \frac{1}{B} - 1 + \frac{r}{2B}\left(\frac{A'}{A} - \frac{B'}{B}\right), \tag{12.7}$$

$$R_{33} = R_{22}\sin^2\theta. \tag{12.8}$$

It is of interest first to determine the consequences of the vanishing off-diagonal 
components of the Ricci tensor, $R_{0i} = 0$ for $i = 1, 2, 3$. From the field equations 
(12.4), and using the fact that $g_{0i} = 0$, we see immediately that we require 
$u_i u_0 = 0$. Combining this with $u_\mu u^\mu = c^2$, we find that the covariant components 
of the fluid 4-velocity are given by 

$$[u_\mu] = c\sqrt{A}\,(1, 0, 0, 0), \tag{12.9}$$

and thus the spatial 3-velocity of the fluid must vanish everywhere. In particular, 
we note that this conclusion holds without our assuming in advance that the 
proper density $\rho$ and the pressure p are independent of t. Thus, the fact that the metric 
(12.1) is independent of t automatically implies that the matter distribution itself 
is static and so the object is in a state of hydrostatic equilibrium. This is another 
illustration of how the equations of motion for matter follow directly from the field 
equations (see Section 8.8). 

Let us now use the diagonal ($\mu = \nu$) components of the field equations (12.2) 
to obtain the differential equations that the functions A(r) and B(r) must satisfy. 
Inserting the expression (12.3) into the right-hand side of the field equations and 
using the metric (12.1), we find that 

$$R_{00} = -\tfrac{1}{2}\kappa\left(\rho c^2 + 3p\right)A, \tag{12.10}$$

$$R_{11} = -\tfrac{1}{2}\kappa\left(\rho c^2 - p\right)B, \tag{12.11}$$

$$R_{22} = -\tfrac{1}{2}\kappa\left(\rho c^2 - p\right)r^2, \tag{12.12}$$

$$R_{33} = R_{22}\sin^2\theta. \tag{12.13}$$

From these equations, one quickly obtains 

$$\frac{R_{00}}{A} + \frac{R_{11}}{B} + \frac{2R_{22}}{r^2} = -2\kappa\rho c^2.$$

On substituting the expressions (12.5-12.8) for the Ricci tensor components and 
simplifying, one finds that 

$$\left(1 - \frac{1}{B}\right) + \frac{rB'}{B^2} = \kappa r^2\rho c^2,$$

which can be rewritten in the form 

$$\frac{d}{dr}\left[r\left(1 - \frac{1}{B}\right)\right] = \kappa r^2\rho c^2. \tag{12.14}$$

Integrating this expression with respect to r, and noting that the associated constant 
of integration must be zero in order for B(r) to be non-zero at the origin (as 
demanded by (12.14)), we find that the solution for B(r) is given by 

$$B(r) = \left[1 - \frac{2Gm(r)}{c^2 r}\right]^{-1}, \tag{12.15}$$

where we have defined the function 

$$m(r) = 4\pi\int_0^r \rho(\bar{r})\,\bar{r}^2\,d\bar{r}. \tag{12.16}$$

This function is worthy of further comment, since it has the appearance of being 
the mass contained within a coordinate radius r. This interpretation is not quite 
correct, however, since the proper spatial volume element for the metric (12.1) is 
given by 

$$d^3V = \sqrt{B(r)}\,r^2\sin\theta\,dr\,d\theta\,d\phi.$$

Thus the proper integrated 'mass' (i.e. energy/$c^2$) within a coordinate radius r is 

$$\tilde{m}(r) = 4\pi\int_0^r \rho(\bar{r})\sqrt{B(\bar{r})}\,\bar{r}^2\,d\bar{r} = 4\pi\int_0^r \rho(\bar{r})\left[1 - \frac{2Gm(\bar{r})}{c^2\bar{r}}\right]^{-1/2}\bar{r}^2\,d\bar{r}.$$

Nevertheless, we note that it is $m(r)$, not $\tilde{m}(r)$, that appears in the radial metric 
coefficient B(r) in (12.15). In particular, if the object extends to a coordinate 
radius r = R, beyond which there is empty space, then the spacetime geometry 
outside this radius is described by the Schwarzschild metric with mass parameter 
$M = m(R)$, rather than $\tilde{M} = \tilde{m}(R)$. The difference $E = \tilde{M} - M$ corresponds to the 
gravitational binding energy of the object, which is the amount of energy required 
to disperse the material of which the object consists to infinite spatial separation. 
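The distinction between the two masses can be made concrete for a star of constant density, for which (12.16) gives m(r) in closed form. The Python sketch below is a minimal illustration under that assumption: it evaluates M = m(R) and the difference between the proper mass and M by a midpoint-rule sum, using the metric coefficient B(r) from (12.15). The function name `uniform_star_masses` and the sample densities and radii are hypothetical choices.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1


def uniform_star_masses(rho, R, n=20000):
    """Return (M, binding) for a constant-density star of coordinate radius R.

    M       = m(R), the integral (12.16);
    binding = proper mass minus M, accumulated shell by shell.
    """
    dr = R / n
    M = 0.0
    binding = 0.0
    for i in range(n):
        r = (i + 0.5) * dr                              # midpoint of the shell
        m_r = 4.0 / 3.0 * math.pi * rho * r**3          # m(r) for constant density
        sqrt_B = 1.0 / math.sqrt(1.0 - 2.0 * G * m_r / (C**2 * r))
        shell = 4.0 * math.pi * rho * r**2 * dr         # coordinate-volume mass
        M += shell
        binding += shell * (sqrt_B - 1.0)               # extra proper-volume mass
    return M, binding


# Assumed, illustrative parameters: a neutron-star-like and a Sun-like object.
M_ns, E_ns = uniform_star_masses(5.0e17, 1.0e4)
M_sun_like, E_sun_like = uniform_star_masses(1.4e3, 7.0e8)
newtonian = 3.0 * G * M_sun_like**2 / (5.0 * 7.0e8 * C**2)

print("compact star, binding / M:", E_ns / M_ns)
print("weak field, binding / Newtonian value:", E_sun_like / newtonian)
```

In the weak-field case the difference should approach the Newtonian binding energy of a uniform sphere, 3GM^2/(5R), divided by c^2, which gives a useful sanity check on the interpretation of E.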

We now turn to determining the differential equation that must be satisfied by 
the function A(r) in (12.1). In principle, this could be obtained by substituting for 
B(r) using (12.15) in any of the equations (12.10-12.13). It is more convenient and 
instructive, however, to use the conservation equation $\nabla_\mu T^{\mu\nu} = 0$ directly, from 
which the fluid equations of motion may be derived (as discussed in Section 8.3). 
Using (12.3), we may write 

$$\nabla_\mu T^{\mu\nu} = \nabla_\mu\left[\left(\rho + \frac{p}{c^2}\right)u^\mu u^\nu\right] - \nabla_\mu\left(p\,g^{\mu\nu}\right)$$

$$= \frac{1}{\sqrt{-g}}\,\partial_\mu\left[\sqrt{-g}\left(\rho + \frac{p}{c^2}\right)u^\mu u^\nu\right] + \left(\rho + \frac{p}{c^2}\right)\Gamma^\nu_{\ \sigma\mu}u^\sigma u^\mu - g^{\mu\nu}\partial_\mu p, \tag{12.17}$$

where, in going from the first to the second line, the first term has been rewritten 
using the expression (8.24) for the covariant divergence and the second term 
has been manipulated by noting that $\nabla_\mu g^{\mu\nu} = 0$ and that p is a scalar function. 
From (12.9), however, we have $u^0 = c/\sqrt{A}$ and $u^i = 0$, and since $\rho$ and p do not 
depend on t the first term on the right-hand side of (12.17) must vanish. For the 
same reason, the second term becomes equal to $(\rho c^2 + p)\Gamma^\nu_{\ 00}/A$. From (3.21) 
and (12.1), we have 

$$\Gamma^\nu_{\ 00} = -\tfrac{1}{2}g^{\mu\nu}\partial_\mu g_{00} = -\tfrac{1}{2}g^{\mu\nu}\partial_\mu A,$$

and so the conservation equation $\nabla_\mu T^{\mu\nu} = 0$ can be written as 

$$\frac{\rho c^2 + p}{2A}\,g^{\mu\nu}\partial_\mu A + g^{\mu\nu}\partial_\mu p = 0.$$

Multiplying through by $g_{\nu\sigma}$ and simplifying, one obtains 

$$\frac{\rho c^2 + p}{2A}\,\partial_\sigma A + \partial_\sigma p = 0. \tag{12.18}$$

Since A is a function only of r, the above equation is trivial for $\sigma = 0$, in which 
case we recover the fact that p is independent of t. Similarly, for $\sigma = 2$ and 
$\sigma = 3$ we find that the corresponding (tangential) derivatives of p also vanish, 
as dictated by spherical symmetry. For $\sigma = 1$, however, the relation (12.18) is 
non-trivial and reads (where primes denote d/dr) 

$$\frac{A'}{A} = -\frac{2p'}{\rho c^2 + p}, \tag{12.19}$$

which gives a differential equation, in terms of $\rho(r)$ and $p(r)$, that A(r) must 
satisfy in hydrostatic equilibrium. 

12.2 The relativistic equations of stellar structure 

The equations (12.15) and (12.19) show how to calculate the functions A(r) and 
B(r) in the metric (12.1), given particular functions $\rho(r)$ and $p(r)$. Specifying 
these two functions does, however, imply an equation of state $p = p(\rho)$ by 
elimination of r, and this is likely to be physically unrealistic for arbitrary choices 
of $\rho(r)$ and $p(r)$. For astrophysical investigations, one is more interested in 
building models of the density and pressure distribution inside a star under the 
assumption of some (quasi-)realistic equation of state. Thus, it is usual to recast 
the results obtained in the previous section into an alternative form. 

In this approach, the first equation of stellar structure is taken simply from 
(12.16) and written as 

$$\frac{dm}{dr} = 4\pi\rho r^2, \tag{12.20}$$

which clearly relates the functions $m(r)$ and $\rho(r)$. The next step is to obtain an 
equation linking $m(r)$ and $p(r)$. This is most conveniently achieved by using 
(12.7) and (12.12): 

$$\frac{1}{B} - 1 + \frac{r}{2B}\left(\frac{A'}{A} - \frac{B'}{B}\right) = -\tfrac{1}{2}\kappa\left(\rho c^2 - p\right)r^2.$$

Eliminating the functions A and B using (12.19) and (12.15) and simplifying, one 
obtains the second equation of stellar structure, 

$$\frac{dp}{dr} = -\frac{G}{r^2}\left(\rho + \frac{p}{c^2}\right)\left[m(r) + \frac{4\pi r^3 p}{c^2}\right]\left[1 - \frac{2Gm(r)}{c^2 r}\right]^{-1}, \tag{12.21}$$

which is also known as the Oppenheimer-Volkoff equation. As mentioned above, 
to obtain a closed system of equations we need to define the equation of state for 
the matter, which gives the pressure in terms of the density, namely 

$$p = p(\rho). \tag{12.22}$$

This provides the third (and final) equation of stellar structure. We note that, for 
many astrophysical systems, the matter obeys a polytropic equation of state of 
the form $p = K\rho^\gamma$, where K and $\gamma$ are both constants. In the usual notation used 
in this field, $\gamma = 1 + 1/n$, where n is known as the polytropic index. 

The closed system of three equations (12.20-12.22) contains two coupled 
first-order differential equations, and so to obtain a unique solution one must 
specify two boundary conditions. The first is straightforward, since we must have 
$m(0) = 0$, leaving just one further adjustable boundary condition to be specified. 
It is most common to choose this adjustable parameter to be the central pressure 
$p(0)$, or equivalently the central density $\rho(0)$, which can be obtained immediately 
from the equation of state (12.22). Very few exact solutions are known for realistic 
equations of state, and so in practice the system of equations is integrated 
numerically on a computer. The procedure is to 'integrate outwards' from r = 0 
(in practice in small radial steps of size $\Delta r$) until the pressure drops to zero. This 
condition defines the surface (r = R) of the star, since otherwise there would be 
an infinite pressure gradient, and hence an infinite force, on the material elements 
constituting the outer layer of the star. For r > R, $p(r)$ and $\rho(r)$ are both zero and 
$m(r) = m(R) = M$, and the spacetime geometry is described by the Schwarzschild 
metric with mass parameter M. 
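The outward integration just described can be sketched in a few lines of Python. This is a minimal illustration, not a production stellar-structure code: it assumes the standard form of the Oppenheimer-Volkoff equation for dp/dr together with dm/dr = 4 pi rho r^2, advances both with fourth-order Runge-Kutta steps, and locates the surface by linear interpolation to zero pressure. The names `derivs`, `integrate_star` and `rho_of_p` are hypothetical, and the constant-density example is chosen only because its radius is known in closed form.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1


def derivs(r, m, p, rho_of_p):
    """Right-hand sides dm/dr and dp/dr of the structure equations."""
    rho = rho_of_p(max(p, 0.0))          # guard against trial steps with p < 0
    dm_dr = 4.0 * math.pi * rho * r**2
    dp_dr = (-(G / r**2) * (rho + p / C**2)
             * (m + 4.0 * math.pi * r**3 * p / C**2)
             / (1.0 - 2.0 * G * m / (C**2 * r)))
    return dm_dr, dp_dr


def integrate_star(p_central, rho_of_p, dr=1.0, r_max=1.0e6):
    """Integrate outwards until the pressure reaches zero; return (R, M)."""
    r = 1.0e-3 * dr                      # start just off-centre to avoid 0/0
    m = 4.0 / 3.0 * math.pi * rho_of_p(p_central) * r**3
    p = p_central
    while r < r_max:
        k1m, k1p = derivs(r, m, p, rho_of_p)
        k2m, k2p = derivs(r + dr / 2, m + dr / 2 * k1m, p + dr / 2 * k1p, rho_of_p)
        k3m, k3p = derivs(r + dr / 2, m + dr / 2 * k2m, p + dr / 2 * k2p, rho_of_p)
        k4m, k4p = derivs(r + dr, m + dr * k3m, p + dr * k3p, rho_of_p)
        m_new = m + dr / 6 * (k1m + 2 * k2m + 2 * k3m + k4m)
        p_new = p + dr / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        if p_new <= 0.0:
            frac = p / (p - p_new)       # linear interpolation to p = 0
            return r + frac * dr, m + frac * (m_new - m)
        r, m, p = r + dr, m_new, p_new
    raise RuntimeError("surface not found within r_max")


# Example with an assumed constant density, for which R is known analytically.
RHO = 5.0e17                             # kg m^-3 (illustrative)
P0 = 0.1 * RHO * C**2                    # central pressure (illustrative)
R_num, M_num = integrate_star(P0, lambda p: RHO)
R_exact = math.sqrt(3.0 * C**2 / (8.0 * math.pi * G * RHO)
                    * (1.0 - ((RHO * C**2 + P0) / (RHO * C**2 + 3.0 * P0))**2))
print("numerical R:", R_num, " analytic R:", R_exact)
```

For a polytrope one would instead pass something like `lambda p: (p / K) ** (1.0 / gamma)`, with K and gamma supplied by the user; the step size may then need to be reduced near the surface, where the density falls to zero.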

Before looking for particular solutions to the set of equations (12.20-12.22), it is 
worthwhile considering briefly their Newtonian limit. In fact, the forms of (12.20) 
and (12.22) remain unchanged in this limit, and it is only the equation (12.21) for 
the pressure gradient that is simplified. In the Newtonian limit we have $p \ll \rho c^2$ and 
therefore $4\pi r^3 p \ll mc^2$. Moreover, we require the metric (12.1) to be close to 
Minkowski, and so we require $2Gm/(c^2 r) \ll 1$. Thus, the Oppenheimer-Volkoff 
equation reduces to 

$$\frac{dp}{dr} = -\frac{Gm(r)\rho(r)}{r^2}, \tag{12.23}$$

which is simply the Newtonian equation for hydrostatic equilibrium. Comparing 
(12.23) with (12.21), we see that all the relativistic effects serve to steepen the 
pressure gradient relative to the Newtonian case. Thus, for an object to remain 
in hydrostatic equilibrium, the fluid of which it consists must experience stronger 
internal forces when general-relativistic effects are taken into account. 
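The size of this steepening can be quantified by taking the ratio of the relativistic pressure gradient to the Newtonian one (12.23). Assuming the standard form of the Oppenheimer-Volkoff equation, the ratio is a product of three factors, each of which is at least unity for non-negative pressure. The Python sketch below evaluates it for two sets of illustrative, assumed values; the function name `steepening_factor` is hypothetical.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1


def steepening_factor(r, m, rho, p):
    """Ratio of the Oppenheimer-Volkoff pressure gradient to the Newtonian one."""
    return ((1.0 + p / (rho * C**2))                       # pressure adds to inertia
            * (1.0 + 4.0 * math.pi * r**3 * p / (m * C**2))  # pressure gravitates
            / (1.0 - 2.0 * G * m / (C**2 * r)))            # curved radial geometry


# Neutron-star-like values (assumed): uniform density, p = 0.1 rho c^2, r = 5 km.
rho_ns, r_ns = 5.0e17, 5.0e3
m_ns = 4.0 / 3.0 * math.pi * rho_ns * r_ns**3
f_ns = steepening_factor(r_ns, m_ns, rho_ns, 0.1 * rho_ns * C**2)

# Sun-like interior values (assumed): the correction is of order 1e-5.
rho_s, r_s, p_s = 1.5e5, 1.0e8, 2.5e16
m_s = 4.0 / 3.0 * math.pi * rho_s * r_s**3
f_s = steepening_factor(r_s, m_s, rho_s, p_s)

print("compact star factor:", f_ns)
print("Sun-like factor minus one:", f_s - 1.0)
```

The contrast between the two cases illustrates why Newtonian stellar structure is adequate for ordinary stars but not for neutron stars.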


12.3 The Schwarzschild constant-density interior solution 

The simplest analytic interior solution for a relativistic star is obtained by making 
the assumption that, throughout the star, 

$$\rho = \text{constant},$$

which constitutes an equation of state. There is no physical justification for this 
assumption, but it is on the borderline of being realistic. It corresponds to an 
ultra-stiff equation of state that represents an incompressible fluid. Consequently, 
the speed of sound in the fluid, which is proportional to $(dp/d\rho)^{1/2}$, is infinite 
(which is clearly not allowed relativistically). Nevertheless, it is believed that the 
interiors of dense neutron stars are of nearly uniform density, and so this simple 
case is of some practical interest. 

Equation (12.20) immediately integrates to give 

$$m(r) = \begin{cases} \tfrac{4}{3}\pi\rho r^3 & \text{for } r \le R, \\ \tfrac{4}{3}\pi\rho R^3 = M & \text{for } r > R, \end{cases} \tag{12.24}$$

where R is the radius of the star, as yet undetermined, and M is the mass parameter 
for the Schwarzschild metric describing the spacetime geometry outside the star. 
Moreover, the Oppenheimer-Volkoff equation (12.21) becomes 

$$\frac{dp}{dr} = -\frac{4\pi G}{3c^4}\,r\left(\rho c^2 + p\right)\left(\rho c^2 + 3p\right)\left[1 - \frac{8\pi G\rho r^2}{3c^2}\right]^{-1}.$$

This equation is separable and we may write 

$$\int_{p_0}^{p(r)}\frac{d\bar{p}}{(\rho c^2 + \bar{p})(\rho c^2 + 3\bar{p})} = -\frac{4\pi G}{3c^4}\int_0^r\frac{\bar{r}\,d\bar{r}}{1 - 8\pi G\rho\bar{r}^2/(3c^2)},$$

where $p_0 = p(0)$ is the central pressure of the star. Performing these standard 
integrals, one finds that 

$$\frac{\rho c^2 + 3p}{\rho c^2 + p} = \frac{\rho c^2 + 3p_0}{\rho c^2 + p_0}\left(1 - \frac{8\pi G}{3c^2}\rho r^2\right)^{1/2}. \tag{12.25}$$

At the surface r = R of the star, the pressure p is zero and so the left-hand side 
of the above equation equals unity. Thus, we obtain 

$$R^2 = \frac{3c^2}{8\pi G\rho}\left[1 - \left(\frac{\rho c^2 + p_0}{\rho c^2 + 3p_0}\right)^2\right],$$

which gives the radius of a star of uniform density $\rho$ with a central pressure $p_0$. 
Alternatively, we may rearrange this result and use (12.24) to obtain a useful 
expression for the central pressure, 

$$p_0 = \rho c^2\,\frac{1 - (1 - 2\mu/R)^{1/2}}{3(1 - 2\mu/R)^{1/2} - 1}, \tag{12.26}$$

where $\mu = GM/c^2$. Using this expression to replace $p_0$ in (12.25) gives 

$$p(r) = \rho c^2\,\frac{(1 - 2\mu r^2/R^3)^{1/2} - (1 - 2\mu/R)^{1/2}}{3(1 - 2\mu/R)^{1/2} - (1 - 2\mu r^2/R^3)^{1/2}} \qquad \text{for } r \le R. \tag{12.27}$$

To obtain the complete solution to the problem, it remains to determine the 
functions A(r) and B(r) in the metric (12.1). From (12.15) and (12.24), we 
immediately find that 

$$B(r) = \left(1 - \frac{2\mu r^2}{R^3}\right)^{-1}. \tag{12.28}$$

In particular, we note that at the star's surface, where r = R, the above solution 
matches with the corresponding expression from the Schwarzschild metric for 
the exterior solution. The function A(r) is obtained from (12.19), (12.24) and 
(12.27). One may fix the integration constant arising from (12.19) by imposing 
the boundary condition that A(r) matches the corresponding expression in the 
Schwarzschild metric at r = R. One then finds 

$$A(r) = \frac{c^2}{4}\left[3\left(1 - \frac{2\mu}{R}\right)^{1/2} - \left(1 - \frac{2\mu r^2}{R^3}\right)^{1/2}\right]^2. \tag{12.29}$$

The expressions (12.28) and (12.29) constitute Schwarzschild's interior solution 
for a constant-density object. 
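As a consistency check on the interior solution, one can evaluate the pressure and the two metric functions numerically and confirm the matching conditions at the surface. The Python sketch below assumes the standard closed forms of the constant-density solution (pressure profile, B(r) and A(r) with A(R) = c^2(1 - 2 mu/R)); the function name `interior_solution` and the stellar mass and radius are illustrative assumptions.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1


def interior_solution(M, R):
    """Return (rho, p, A, B) for a uniform-density star of mass M and radius R."""
    mu = G * M / C**2
    if mu / R >= 4.0 / 9.0:
        raise ValueError("central pressure diverges: need mu/R < 4/9")
    rho = 3.0 * M / (4.0 * math.pi * R**3)
    a = math.sqrt(1.0 - 2.0 * mu / R)           # value of the root at the surface

    def b(r):
        return math.sqrt(1.0 - 2.0 * mu * r**2 / R**3)

    def p(r):                                   # pressure profile
        return rho * C**2 * (b(r) - a) / (3.0 * a - b(r))

    def A(r):                                   # time-time metric function
        return 0.25 * C**2 * (3.0 * a - b(r))**2

    def B(r):                                   # radial metric function
        return 1.0 / (1.0 - 2.0 * mu * r**2 / R**3)

    return rho, p, A, B


# Illustrative star (assumed values): 1.4 solar masses, 12 km coordinate radius.
M_star, R_star = 1.4 * 1.989e30, 1.2e4
mu_star = G * M_star / C**2
rho, p, A, B = interior_solution(M_star, R_star)

print("p(0) / (rho c^2) =", p(0.0) / (rho * C**2))
print("A(R) / [c^2 (1 - 2 mu / R)] =", A(R_star) / (C**2 * (1.0 - 2.0 * mu_star / R_star)))
print("B(R) (1 - 2 mu / R) =", B(R_star) * (1.0 - 2.0 * mu_star / R_star))
```

The last two printed ratios should equal unity to rounding error, reflecting the continuity of the metric across the stellar surface, and the pressure should vanish there.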
12.4 Buchdahl's theorem 


The most important feature of the Schwarzschild constant-density solution 
discussed above is that it imposes a constraint connecting the star's 'mass' M 
and its (coordinate) radius R. To derive this constraint, one notices that (12.26) 
implies that $p_0 \to \infty$ as $\mu/R \to 4/9$. Since pressure is a general scalar, this 
infinity will persist in any coordinate system, and so one can only avoid this 
behaviour by demanding that 

$$\frac{GM}{c^2 R} < \frac{4}{9}. \tag{12.30}$$

Although we have only shown that this constraint holds for an object of constant 
density, Buchdahl's theorem states that (12.30) is in fact valid for any equation 
of state. This theorem can be proved directly from the Einstein equations but 
requires considerable care and lies outside the scope of our discussion. 

Equation (12.30) can be regarded as providing an upper limit on the mass of 
a star for a fixed radius. If one attempts to pack more mass inside R than is 
allowed by (12.30), general relativity admits no static solution: the hydrostatic 
equilibrium is destroyed by the increased gravitational attraction. Such a star must 
therefore collapse inwards without stopping. Throughout the collapse, the exterior 
geometry is described by the Schwarzschild metric, and so eventually one obtains 
a Schwarzschild black hole. The limit (12.30) is, in fact, quite easily reached. For 
example, the density of a neutron star is around $10^{16}$ kg m$^{-3}$ and, assuming it to 
be of uniform density, we find from (12.30) and (12.24) that $M < 7 \times 10^{31}$ kg. 
This is approximately 35 solar masses, which is of the same order as the most massive 
stars in our Galaxy. 
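The arithmetic behind this estimate is short enough to check directly. Combining the bound GM/(c^2 R) < 4/9 with M = (4/3) pi rho R^3 for uniform density gives R^2 < c^2/(3 pi G rho), and hence a maximum mass. The Python sketch below evaluates this for the density quoted above; the function name `buchdahl_limit` is hypothetical and the physical constants are rounded values.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
M_SUN = 1.989e30   # solar mass, kg


def buchdahl_limit(rho):
    """Largest radius and mass of a uniform-density star with GM/(c^2 R) < 4/9."""
    R_max = C / math.sqrt(3.0 * math.pi * G * rho)
    M_max = 4.0 / 3.0 * math.pi * rho * R_max**3
    return R_max, M_max


R_max, M_max = buchdahl_limit(1.0e16)
print(f"R_max ~ {R_max / 1e3:.0f} km, M_max ~ {M_max:.1e} kg ~ {M_max / M_SUN:.0f} solar masses")
```

Because the maximum mass scales as the inverse square root of the density, adopting a higher density would lower the limit accordingly.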


12.5 The metric outside a spherically symmetric charged mass 

We now turn to our second physical application, namely the form of the metric 
outside a static spherically symmetric charged body. The exterior of such an 
object is not a vacuum, since it is filled with a static electric field. We must 
therefore once again solve the Einstein field equations for a static spherically 
symmetric spacetime in the presence of a non-zero energy-momentum tensor, 
this time representing the electromagnetic field of the object. 

Since we are assuming spherical symmetry and a static object, the general form 
of the metric is once more given by (12.1). The two functions A(r) and B(r) 
arc determined by solving the full Einstein equations outside the spherical object; 
these equations arc again most conveniently written in the form (12.2). In this 
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case, however, T pv is the energy-momentum tensor of the electromagnetic field 
of the charged object, which from Exercise 8.3 has the general form 

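$$T_{\mu\nu} = -\mu_0^{-1}\left(F_{\mu\rho}F_\nu{}^{\rho} - \tfrac{1}{4}g_{\mu\nu}F_{\rho\sigma}F^{\rho\sigma}\right), \tag{12.31}$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the electromagnetic field strength tensor and $A_\mu$ is the 
electromagnetic 4-potential. The first point to note about this energy-momentum 
tensor is that it has zero trace, 

$$T^\mu{}_\mu = -\mu_0^{-1}\left(F_{\mu\rho}F^{\mu\rho} - \tfrac{1}{4}\delta^\mu_\mu F_{\rho\sigma}F^{\rho\sigma}\right) = 0.$$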


Thus, in this case, the Einstein field equations (12.2) take the simplified form 

$$R_{\mu\nu} = -\kappa T_{\mu\nu}. \tag{12.32}$$

In addition to the Einstein field equations, our solution must also satisfy the 
Maxwell equations. In the region outside the charged object, the 4-current density 
$j^\mu$ is zero and so the Maxwell equations read 

$$\nabla_\mu F^{\mu\nu} = 0, \tag{12.33}$$

$$\nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} = 0. \tag{12.34}$$

The Einstein and Maxwell equations are coupled together, since $F_{\mu\nu}$ enters the 
gravitational field equations through the energy-momentum tensor (12.31) and 
the metric $g_{\mu\nu}$ enters the electromagnetic field equations through the covariant 
derivative. 

The constraint imposed on the metric coefficients $g_{\mu\nu}$ (or gravitational fields) 
by requiring the solution to be spherically symmetric and static is embodied in the 
choice of line element (12.1). We thus begin by considering the corresponding 
consequences of these symmetry constraints for the form of the electromagnetic 
field. In this case, the electromagnetic 4-potential in $(t, r, \theta, \phi)$ coordinates takes 
the form 

$$[A_\mu] = \left(\frac{\phi(r)}{c^2},\; a(r),\; 0,\; 0\right), \tag{12.35}$$

where $\phi(r)$ and $a(r)$ depend only on $r$ and may be interpreted respectively as 
the electrostatic potential and the radial component of the 3-vector potential as 
$r \to \infty$ (the extra factor of $1/c$ multiplying $\phi(r)$ in (12.35), as compared with 
the usual form in Minkowski coordinates, is a result of taking $x^0 = t$ rather than 
$x^0 = ct$; also, note that the 3-vector potential $a(r)$ should not be confused with 
the function $A(r)$ in the metric (12.1)). From (12.35), the field-strength tensor has 
the form 

$$[F_{\mu\nu}] = E(r)\begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \tag{12.36}$$


where $E(r)$ is an arbitrary function of $r$ only and may be interpreted as the radial 
component of the static electric field as $r \to \infty$. Thus our task is to use the 
Einstein equations (12.32) and the Maxwell equations (12.33-12.34) to determine 
the three unknown functions $A(r)$, $B(r)$ and $E(r)$. 

Let us begin by using the Maxwell equations. As discussed in Section 6.6, the 
equations (12.34) are automatically satisfied by the definition of $F_{\mu\nu}$. Moreover, 
from Exercise 4.10, since $F^{\mu\nu}$ is antisymmetric we may rewrite the covariant 
divergence in the first Maxwell equation (12.33) to obtain 

$$\nabla_\mu F^{\mu\nu} = \frac{1}{\sqrt{-g}}\,\partial_\mu\left(\sqrt{-g}\,F^{\mu\nu}\right) = 0, \tag{12.37}$$


where $g$ is the determinant of the metric. For a diagonal line element such as 
(12.1), the determinant is simply the product of the diagonal elements, so that 
$g = -A(r)B(r)r^4\sin^2\theta$. Given the form of $F_{\mu\nu}$ in (12.36), the expression (12.37) 
yields the single equation 

$$\partial_r\left(\sqrt{AB}\,r^2 F^{10}\right) = 0.$$

Writing the required contravariant component as $F^{10} = g^{1\mu}g^{0\nu}F_{\mu\nu} = g^{11}g^{00}F_{10} = -E/(AB)$, 
we thus obtain the equation 

$$\frac{d}{dr}\left(\frac{r^2 E}{\sqrt{AB}}\right) = 0.$$

This integrates to give 

$$E(r) = \frac{k\sqrt{A(r)B(r)}}{r^2}, \tag{12.38}$$


where $k$ is a constant of integration. If we make the assumption that the metric is 
asymptotically flat then $A(r) \to c^2$ and $B(r) \to 1$ as $r \to \infty$. Identifying $E(r)$ with 
the radial electric field component at infinity, we thus require $k = Q/(4\pi\epsilon_0 c)$, 
where $Q$ is the total charge of the object. 

We now turn to the Einstein equations (12.32). The Ricci tensor components 
for the metric (12.1) are given in (12.5-12.8), and the form of the electromagnetic 
field energy-momentum tensor $T_{\mu\nu}$ may be found by substituting the form (12.36) 
for $F_{\mu\nu}$ into the expression (12.31). On performing this substitution, one quickly 
finds that the off-diagonal components of $T_{\mu\nu}$ are zero, and so the Einstein 
equations for $\mu \neq \nu$ are satisfied identically. For the diagonal components of the 
Einstein equations, one finds 

$$R_{00} = -\tfrac{1}{2}\kappa c^2\epsilon_0 E^2/B, \tag{12.39}$$

$$R_{11} = \tfrac{1}{2}\kappa c^2\epsilon_0 E^2/A, \tag{12.40}$$

$$R_{22} = -\tfrac{1}{2}\kappa c^2\epsilon_0 r^2 E^2/(AB), \tag{12.41}$$

$$R_{33} = R_{22}\sin^2\theta, \tag{12.42}$$

where we have used the facts that $F_0{}^1 = g^{11}F_{01} = E/B$ and $F_1{}^0 = g^{00}F_{10} = E/A$; 
we have also made use of the relation $\mu_0\epsilon_0 = 1/c^2$. From (12.39) and (12.40), 
we immediately obtain 

$$BR_{00} + AR_{11} = 0.$$

On substituting the expressions (12.5, 12.6) for the Ricci tensor coefficients and 
rearranging, this yields 

$$A'B + B'A = 0,$$

which implies that $AB = \text{constant}$. We may fix this constant from the requirement 
that the metric is asymptotically flat as $r \to \infty$, and so we have 

$$A(r)B(r) = c^2. \tag{12.43}$$


A further independent equation may be obtained from the 22-component (12.41) 
of the Einstein equations. Inserting the expression (12.7) for the Ricci tensor 
component and using (12.38), one finds that 


$$A + rA' = c^2\left(1 - \frac{GQ^2}{4\pi\epsilon_0 c^4 r^2}\right).$$

Noting that $A + rA' = (rA)'$ and integrating, one thus obtains 

$$A(r) = c^2\left(1 - \frac{2GM}{c^2 r} + \frac{GQ^2}{4\pi\epsilon_0 c^4 r^2}\right),$$


where we have identified the integration constant as $-2GM/c^2$, $M$ being the mass 
of the object, since the line element must reduce to the Schwarzschild case when 
$Q = 0$. The solutions for $B(r)$ and $E(r)$ are then found immediately from (12.43) 
and (12.38) respectively. 

Thus, collecting our results together and defining the constants $\mu = GM/c^2$ and 
$q^2 = GQ^2/(4\pi\epsilon_0 c^4)$, the line element for the spacetime outside a static spherically 
symmetric body of mass $M$ and charge $Q$ has the form 

$$ds^2 = c^2\left(1 - \frac{2\mu}{r} + \frac{q^2}{r^2}\right)dt^2 - \left(1 - \frac{2\mu}{r} + \frac{q^2}{r^2}\right)^{-1}dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right), \tag{12.44}$$

from which one may read off the metric coefficients $g_{\mu\nu}$ that determine the 
gravitational field of the object. The resulting solution is known as the 
Reissner-Nordstrom geometry. The electromagnetic field tensor $F_{\mu\nu}$ of the object is given 
by (12.36) with 

$$E(r) = \frac{Q}{4\pi\epsilon_0 r^2}.$$

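
The solution can be checked numerically. The sketch below, written in units with $c = 1$ by default, confirms by central finite differences that $A(r) = c^2(1 - 2\mu/r + q^2/r^2)$ satisfies $(rA)' = c^2(1 - q^2/r^2)$, and that $B(r)$ follows from (12.43). The function names `rn_A`, `rn_B` and `residual` are hypothetical helpers introduced only for this illustration.

```python
def rn_A(r, mu, q, c=1.0):
    """Metric function A(r) = c^2 (1 - 2 mu/r + q^2/r^2)."""
    return c**2 * (1.0 - 2.0 * mu / r + q**2 / r**2)


def rn_B(r, mu, q, c=1.0):
    """Metric function B(r) fixed by A(r) B(r) = c^2, equation (12.43).

    Not defined where A(r) = 0 (division by zero).
    """
    return c**2 / rn_A(r, mu, q, c)


def d_rA_dr(r, mu, q, c=1.0, h=1.0e-6):
    """Central finite-difference estimate of (r A)'."""
    upper = (r + h) * rn_A(r + h, mu, q, c)
    lower = (r - h) * rn_A(r - h, mu, q, c)
    return (upper - lower) / (2.0 * h)


def residual(r, mu, q, c=1.0):
    """(r A)' minus c^2 (1 - q^2/r^2); zero up to discretisation error."""
    return d_rA_dr(r, mu, q, c) - c**2 * (1.0 - q**2 / r**2)


for radius in (0.5, 2.0, 10.0):
    print(radius, residual(radius, mu=1.0, q=0.6))
```

Setting `q = 0` recovers the Schwarzschild function $1 - 2\mu/r$, as required.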


12.6 The Reissner-Nordstrom geometry: charged black holes 

The Reissner-Nordstrom (RN) metric (12.44) is only valid down to the surface of 
the charged object. As in our discussion of the Schwarzschild solution, however, 
it is of interest to consider the structure of the full RN geometry, namely the 
solution to the coupled Einstein-Maxwell field equations for a charged point mass 
located at the origin r = 0, in which case the RN metric is valid for all positive r. 

Calculation of the invariant curvature scalar $R_{\mu\nu\sigma\rho}R^{\mu\nu\sigma\rho}$ shows that the only 
intrinsic singularity in the RN metric occurs at $r = 0$. In the ‘Schwarzschild-like’ 
coordinates $(t, r, \theta, \phi)$, however, the RN metric also possesses a coordinate 
singularity wherever $r$ satisfies 

$$\Delta(r) \equiv 1 - \frac{2\mu}{r} + \frac{q^2}{r^2} = 0, \tag{12.45}$$

with $\Delta(r) = -1/g_{11}(r) = g_{00}(r)/c^2$. Multiplying (12.45) through by $r^2$ and solving 
the resulting quadratic equation, we find that the coordinate singularities occur 
on the surfaces $r = r_\pm$, where 

$$r_\pm = \mu \pm (\mu^2 - q^2)^{1/2}. \tag{12.46}$$


It is clear that there exist three distinct cases, depending on the relative values of 
$\mu^2$ and $q^2$; we now discuss these in turn. 

Case 1: $\mu^2 < q^2$ In this case $r_\pm$ are both complex, and so no coordinate singularities 
exist. The metric is therefore regular for all positive values of $r$. Since the function $\Delta(r)$ 
of (12.45) always remains positive, the coordinate $t$ is always timelike and $r$ is always spacelike. 
Thus, the intrinsic singularity at $r = 0$ is a timelike line, as opposed to a spacelike line 
in the Schwarzschild case. This means that the singularity does not necessarily lie in 
the future of timelike trajectories and so, in principle, can be avoided. In the absence 
of any event horizons, however, $r = 0$ is a naked singularity, which is visible to the 
outside world. The physical consequences of a naked singularity, such as the existence 
of closed timelike curves, appear so extreme that Penrose has suggested the existence 
of a cosmic censorship hypothesis, which would only allow singularities that are hidden 
behind an event horizon. As a result, the case $\mu^2 < q^2$ is not considered physically 
realistic. 

Case 2: $\mu^2 > q^2$ In this case, $r_\pm$ are both real and so there exist two coordinate 
singularities, occurring on the surfaces $r = r_\pm$. The situation at $r = r_+$ is very similar to 
the Schwarzschild case at $r = 2\mu$. For $r > r_+$, the function $\Delta(r)$ of (12.45) is positive and so the 
coordinates $t$ and $r$ are timelike and spacelike respectively. In the region $r_- < r < r_+$, 
however, $\Delta(r)$ becomes negative and so the physical natures of the coordinates $t$ and 
$r$ are interchanged. Thus, a massive particle or photon that enters the surface $r = r_+$ 
from outside must necessarily move in the direction of decreasing $r$, and thus $r = r_+$ 
is an event horizon. The major difference from the Schwarzschild geometry is that 
the irreversible infall of the particle need only continue to the surface $r = r_-$, since 
for $r < r_-$ the function $\Delta(r)$ is again positive and so $t$ and $r$ recover their timelike 
and spacelike properties. Within $r = r_-$, one may (with a rocket engine) move in 
the direction of either positive or negative $r$, or stand still. Thus, one may avoid the 
intrinsic singularity at $r = 0$, which is consistent with the fact that $r = 0$ is a timelike 
line. Perhaps even more astonishing is what happens if one then chooses to travel 
back in the direction of positive $r$ in the region $r < r_-$. On performing a maximal 
analytic extension of the RN geometry, in analogy with the Kruskal extension for the 
Schwarzschild geometry discussed in Section 11.9, one finds that one may re-cross 
the surface $r = r_-$, but this time from the inside. Once again one is moving from a 
region in which $r$ is spacelike to a region in which it is timelike, but this time the 
sense is reversed and one is forced to move in the direction of increasing $r$. Thus 
$r = r_-$ acts as an ‘inside-out’ event horizon. Moreover, one is eventually forcibly 
ejected from the surface $r = r_+$ but, according to the maximal analytic extension, the 
particle emerges into an asymptotically flat spacetime different from that from which it 
first entered the black hole. As discussed in Section 11.9, however, such matters are 
at best highly speculative, and we shall not pursue them further here. 

Case 3: $\mu^2 = q^2$ In this case, called the extreme Reissner-Nordstrom black hole, the 
function $\Delta(r)$ of (12.45) is positive everywhere except at $r = \mu$, where it equals zero. Thus, the 
coordinate $r$ is everywhere spacelike except at $r = \mu$, where it becomes null, and hence 
$r = \mu$ is an event horizon. The extreme case is basically the same as that considered 
in case 2, but with the region $r_- < r < r_+$ removed. 

We may illustrate the properties of the RN spacetime in more detail by considering 
the paths of photons and massive particles in the geometry, which we now go on 
to discuss. Since the case $\mu^2 > q^2$ is the most physically reasonable RN spacetime, 
we shall restrict our discussion to this situation. 
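
The three cases can be told apart directly from the sign of $\mu^2 - q^2$ in (12.46). The following minimal sketch classifies a given pair $(\mu, q)$ and returns the horizon radii; the function names `rn_delta` and `rn_horizons` and the case labels are our own illustrative choices.

```python
import math


def rn_delta(r, mu, q):
    """Delta(r) = 1 - 2 mu/r + q^2/r^2, as in (12.45)."""
    return 1.0 - 2.0 * mu / r + q**2 / r**2


def rn_horizons(mu, q):
    """Return (label, radii) with the coordinate singularities r_-, r_+ of (12.46)."""
    disc = mu**2 - q**2
    if disc < 0.0:
        # Case 1: r_+- are complex, so there is no horizon.
        return "case 1: no horizons", ()
    if disc == 0.0:
        # Case 3: exact equality is fragile in floating point; fine for a sketch.
        return "case 3: extreme", (mu,)
    root = math.sqrt(disc)
    return "case 2: two horizons", (mu - root, mu + root)


label, radii = rn_horizons(1.0, 0.6)
print(label, radii)
# Sign of Delta outside, between and inside the horizons:
print([rn_delta(r, 1.0, 0.6) > 0.0 for r in (3.0, 1.0, 0.1)])
```

For $\mu = 1$ and $q = 0.6$ the sign pattern of $\Delta$ is positive, negative, positive as $r$ decreases through $r_+$ and $r_-$, which is the behaviour described in Case 2.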


12.7 Radial photon trajectories in the RN geometry 

Let us begin by investigating the paths of radially incoming and outgoing photons 
in the RN metric for the case $\mu^2 > q^2$. Since $ds = d\theta = d\phi = 0$ for a radially 
moving photon, we have immediately from (12.44) that 

$$\frac{dt}{dr} = \pm\frac{1}{c}\left(1 - \frac{2\mu}{r} + \frac{q^2}{r^2}\right)^{-1} = \pm\frac{1}{c}\,\frac{r^2}{(r - r_-)(r - r_+)}, \tag{12.47}$$


where, in the second equality, we have used the result (12.46); the plus sign 
corresponds to an outgoing photon and the minus sign to an incoming photon. On 
integrating, we obtain 


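$$ct = r - \frac{r_-^2}{r_+ - r_-}\ln\left|\frac{r}{r_-} - 1\right| + \frac{r_+^2}{r_+ - r_-}\ln\left|\frac{r}{r_+} - 1\right| + \text{constant} \quad \text{(outgoing)},$$

$$ct = -r + \frac{r_-^2}{r_+ - r_-}\ln\left|\frac{r}{r_-} - 1\right| - \frac{r_+^2}{r_+ - r_-}\ln\left|\frac{r}{r_+} - 1\right| + \text{constant} \quad \text{(ingoing)}.$$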
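
The integration can be verified numerically. The sketch below differentiates the outgoing solution by central finite differences and compares the result with $c\,dt/dr = r^2/[(r - r_-)(r - r_+)]$ from (12.47). It assumes $r_+ > r_-$ and a radius away from both horizons; the function names are hypothetical.

```python
import math


def ct_outgoing(r, r_minus, r_plus):
    """ct(r) for an outgoing radial photon, up to an additive constant."""
    width = r_plus - r_minus
    return (r
            - r_minus**2 / width * math.log(abs(r / r_minus - 1.0))
            + r_plus**2 / width * math.log(abs(r / r_plus - 1.0)))


def c_dt_dr(r, r_minus, r_plus):
    """c dt/dr from (12.47), taking the plus sign (outgoing photon)."""
    return r**2 / ((r - r_minus) * (r - r_plus))


def numerical_slope(r, r_minus, r_plus, h=1.0e-6):
    """Central finite-difference derivative of ct_outgoing with respect to r."""
    return (ct_outgoing(r + h, r_minus, r_plus)
            - ct_outgoing(r - h, r_minus, r_plus)) / (2.0 * h)


for radius in (0.1, 1.0, 3.0):  # inside r_-, between the horizons, outside r_+
    print(radius, numerical_slope(radius, 0.2, 1.8), c_dt_dr(radius, 0.2, 1.8))
```

The ingoing solution is the negative of the outgoing one, so the same check applies to it with the sign reversed.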


We will concentrate in particular on the ingoing radial photons. To develop a 
better description of infalling particles in general, we may construct the equivalent 
of the advanced Eddington-Finkelstein coordinates derived for the Schwarzschild 
metric in Section 11.5. Once again this coordinate system is based on radially 
infalling photons, and the trick is to use the integration constant as the new 
coordinate, which we denote by $p$. As before, $p$ is a null coordinate and it is more 
convenient to work instead with the timelike coordinate $t'$ defined by $ct' = p - r$. 
Thus, we have 


$$ct' = ct - \frac{r_-^2}{r_+ - r_-}\ln\left|\frac{r}{r_-} - 1\right| + \frac{r_+^2}{r_+ - r_-}\ln\left|\frac{r}{r_+} - 1\right|. \tag{12.48}$$

On differentiating, or from (12.47) directly, one obtains 

$$c\,dt' = dp - dr = c\,dt + \left[\frac{1}{\Delta(r)} - 1\right]dr, \tag{12.49}$$

where $\Delta(r)$ is defined in (12.45). Using the above expression to substitute for $c\,dt$ 
in (12.44), one quickly finds that 

$$ds^2 = c^2\Delta\,dt'^2 - 2c(1 - \Delta)\,dt'\,dr - (2 - \Delta)\,dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right),$$



which is the RN metric in advanced Eddington-Finkelstein coordinates. In particular, 
we note that this form is regular for all positive values of $r$ and has an 
intrinsic singularity at $r = 0$. 

From (12.47) and (12.49), one finds that, in advanced Eddington-Finkelstein 
coordinates, the equation for ingoing radial photon trajectories is 

$$ct' + r = \text{constant}, \tag{12.50}$$

whereas the trajectories for outgoing radial photons satisfy the differential equation 

$$c\,\frac{dt'}{dr} = \frac{2 - \Delta}{\Delta}. \tag{12.51}$$


Figure 12.1 Spacetime diagram of the Reissner-Nordstrom solution in advanced 
Eddington-Finkelstein coordinates. The straight diagonal lines are ingoing 
photon worldlines whereas the curved lines correspond to outgoing photon worldlines. 

We may use these equations to determine the light-cone structure of the RN metric 
in these coordinates. For ingoing radial photons, the trajectories (12.50) are simply 
straight lines at 45° in a spacetime diagram. For outgoing radial photons, (12.51) 
gives the gradient of the trajectory at any point in the spacetime diagram, and so 
one may sketch these without solving (12.51) explicitly. This resulting spacetime 
diagram is shown in Figure 12.1. It is worth noting that the light-cone structure 
depicted confirms the nature of the event horizon at $r = r_+$. Moreover, the light-cones 
remain tilted over in the region $r_- < r < r_+$, indicating that any particle 
falling into this region must move inwards until it reaches $r = r_-$. Once in the 
region $r < r_-$, the lightcones are no longer tilted and so particles need not fall into 
the singularity $r = 0$. As was the case in Section 11.5 for the Schwarzschild metric, 
however, this spacetime diagram may be somewhat misleading. For an outward-moving 
particle in the region $r < r_-$, Figure 12.1 suggests that it can only reach 
$r = r_-$ asymptotically, but by performing an analytic extension of the RN solution 
one can show that the particle can cross the surface $r = r_-$ in finite proper time. 
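
The tilting of the light-cones can be made concrete by evaluating the two photon gradients numerically. In this sketch the ingoing gradient is the constant $-1$ implied by $ct' + r = \text{constant}$, and the outgoing gradient is $c\,dt'/dr = (2 - \Delta)/\Delta$ with $\Delta(r) = 1 - 2\mu/r + q^2/r^2$. The names `rn_delta` and `outgoing_gradient` are assumptions made for illustration only.

```python
INGOING_GRADIENT = -1.0  # c dt'/dr along ct' + r = constant


def rn_delta(r, mu, q):
    """Delta(r) = 1 - 2 mu/r + q^2/r^2."""
    return 1.0 - 2.0 * mu / r + q**2 / r**2


def outgoing_gradient(r, mu, q):
    """c dt'/dr for outgoing radial photons; undefined where Delta(r) = 0."""
    delta = rn_delta(r, mu, q)
    return (2.0 - delta) / delta


mu, q = 1.0, 0.6  # horizons at r_- = 0.2 and r_+ = 1.8
for radius in (3.0, 1.0):
    print(radius, INGOING_GRADIENT, outgoing_gradient(radius, mu, q))
```

Outside $r_+$ the outgoing gradient is positive, so $r$ increases with $t'$ along the photon path. Between the horizons it is negative, so both families of radial photons move towards smaller $r$ as $t'$ increases.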


12.8 Radial massive particle trajectories in the RN geometry 

We now consider the trajectories of radially moving massive particles for the 
case $\mu^2 > q^2$. To simplify our discussion, we will assume that the particles are 
electrically neutral. In this case, the particles will follow geodesics. In the more 
general case of an electrically charged particle, one must also take into account 
the Lorentz force on the particle produced by the electromagnetic field of the black 
hole. The equation of motion for the particle is then given by (6.13). 

For a radially moving particle, the 4-velocity has the form 

$$[u^\mu] = (u^0, u^1, 0, 0) = (\dot{t}, \dot{r}, 0, 0),$$

where the dots denote differentiation with respect to the proper time $\tau$ of the 
particle. The geodesic equations of motion, obeyed by neutral particles in the RN 
metric, are most conveniently written in the form (3.56): 

$$\dot{u}_\mu = \tfrac{1}{2}\left(\partial_\mu g_{\nu\sigma}\right)u^\nu u^\sigma.$$

Since the metric coefficients in the RN line element (12.44) do not depend on $t$, 
we immediately obtain 

$$u_0 = g_{00}\dot{t} = \text{constant}.$$

The radial equation of motion may then be obtained using the normalisation 
condition $g_{\mu\nu}u^\mu u^\nu = c^2$, which gives 

$$g_{00}(u^0)^2 + g_{11}(u^1)^2 = c^2.$$

Using the fact, from (12.45), that $\Delta(r) = g_{00}/c^2 = -1/g_{11}$, one finds that 

$$\dot{r}^2 + c^2\Delta(r) = \frac{u_0^2}{c^2}. \tag{12.52}$$

This clearly has the form of an ‘energy’ equation, in which $c^2\Delta(r)$ plays the role 
of a potential. Qualitative information on the properties of the radial trajectories 
can be obtained directly from (12.52) by simply plotting the function $c^2\Delta(r)$; 
this plot is shown in Figure 12.2. The radial limits of the motion depend on 
the choice of the constant $u_0$, as indicated. The case $u_0 = c^2$ corresponds to the 
particle being released from rest at infinity. In all cases, there exists an inner radial 
limit that is greater than zero. This indicates that a neutral particle moving freely 
under gravity cannot reach the central intrinsic singularity at $r = 0$ but is instead 
repelled once it has approached to within some finite distance. As mentioned in 
Section 12.6, performing a maximal analytic extension of the RN metric suggests 
that the particle passes back through $r = r_-$ and $r = r_+$ and ultimately emerges 
in a different asymptotically flat spacetime. 

Figure 12.2 The limits of radial motion for a neutral massive particle in the 
Reissner-Nordstrom geometry. 
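
The inner radial limit can be found explicitly by setting $\dot{r} = 0$ in the energy equation $\dot{r}^2 + c^2\Delta(r) = u_0^2/c^2$, which gives $\Delta(r) = \varepsilon$ with $\varepsilon \equiv u_0^2/c^4$, that is, the quadratic $(1 - \varepsilon)r^2 - 2\mu r + q^2 = 0$. The sketch below returns its smaller positive root in a form that remains valid at $\varepsilon = 1$. The symbol $\varepsilon$ and the function names are our own notation, introduced only for this example.

```python
import math


def rn_delta(r, mu, q):
    """Delta(r) = 1 - 2 mu/r + q^2/r^2."""
    return 1.0 - 2.0 * mu / r + q**2 / r**2


def inner_turning_radius(mu, q, eps):
    """Smallest r > 0 with Delta(r) = eps, where eps = u_0^2 / c^4 >= 0.

    Uses r = q^2 / (mu + sqrt(mu^2 - (1 - eps) q^2)), the smaller root of
    (1 - eps) r^2 - 2 mu r + q^2 = 0, written so that eps = 1 causes no
    division by zero.
    """
    disc = mu**2 - (1.0 - eps) * q**2
    if disc < 0.0:
        raise ValueError("no real turning point for these parameters")
    return q**2 / (mu + math.sqrt(disc))


# Particle released from rest at infinity (eps = 1): r_min = q^2 / (2 mu).
print(inner_turning_radius(1.0, 0.5, 1.0))
```

For $\mu^2 > q^2$ the returned radius is positive and lies inside $r = r_-$, in line with the statement that the particle is turned back before reaching $r = 0$.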


Exercises 

12.1 For a general static diagonal metric, show that the 4-velocity of a perfect fluid in 
the spacetime must have the form 

$$[u^\mu] = \frac{c}{\sqrt{g_{00}}}(1, 0, 0, 0).$$












12.2 Calculate the gravitational binding energy $E = \tilde{M} - M$ of a spherical star of constant 
density $\rho$ and coordinate radius $R$. Compare your answer with the corresponding 
Newtonian result and interpret your findings physically. 

12.3 Derive the Oppenheimer-Volkoff equation from the Einstein equations for a static 
spherically symmetric perfect-fluid distribution, and show that it reduces to the 
standard equation for hydrostatic equilibrium in the Newtonian limit. 

12.4 In Newtonian gravity, show directly that the equation for hydrostatic equilibrium is 

$$\frac{dp(r)}{dr} = -\frac{Gm(r)\rho(r)}{r^2}.$$

12.5 Show that, in the Newtonian limit, the equation before (12.15) reduces to 

$$\frac{d\Phi(r)}{dr} = \frac{Gm(r)}{r^2},$$

where $\Phi(r)$ is the Newtonian gravitational potential. 

12.6 For a spherical star of uniform density $\rho$ and central pressure $p_0$, verify that the 
Oppenheimer-Volkoff equation requires $p(r)$ to satisfy 

$$\frac{\rho c^2 + 3p(r)}{\rho c^2 + p(r)} = \frac{\rho c^2 + 3p_0}{\rho c^2 + p_0}\left(1 - \frac{8\pi G\rho}{3c^2}r^2\right)^{1/2},$$

and hence show that 

$$p(r) = \rho c^2\,\frac{(1 - 2\mu r^2/R^3)^{1/2} - (1 - 2\mu/R)^{1/2}}{3(1 - 2\mu/R)^{1/2} - (1 - 2\mu r^2/R^3)^{1/2}},$$

where $R$ is the coordinate radius of the star. 

12.7 In Newtonian gravity, obtain the expression for $p(r)$ for a spherical star of uniform 
density $\rho$, central pressure $p_0$ and radius $R$. Compare your result with that obtained 
in Exercise 12.6. 

12.8 Show that, for a spherical star of uniform density $\rho$, 

$$M^2 < \frac{16c^6}{243\pi\rho G^3}.$$

If a photon is emitted from the star’s surface and received by a stationary observer 
at infinity, show that the observed redshift must obey the constraint $z < 2$. Show 
also, however, that the observed redshift for a photon emitted from the star’s centre 
can be arbitrarily large. 

12.9 For a spherical star of uniform density $\rho$, show that in order for the star not to lie 
within its own Schwarzschild radius, one requires 

$$M^2 < \frac{3c^6}{32\pi\rho G^3}.$$

Compare this limit with that derived in Exercise 12.8. 

12.10 For a spherical uniform-density star of mass $M$ and coordinate radius $R$, show that 
the line element of spatial sections with $t = \text{constant}$ can be written in the form 

$$d\sigma^2 = \frac{R^3}{2\mu}\left[d\chi^2 + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right].$$

12.11 Consider a static infinitely long cylindrical configuration of matter that is invariant 
to translations and Lorentz boosts along the axis of symmetry (a cosmic string). 
Adopting ‘cylindrical polar’ coordinates $(ct, r, \phi, z)$, show that a self-consistent 
solution to the Einstein field equations may be obtained if the stress-energy tensor 
for the matter is of the form 

$$[T_{\mu\nu}] = \mathrm{diag}(\rho c^2, 0, 0, -\rho c^2),$$

such that there is a negative pressure (or tension) along the string, and the line 
element is of the form 

$$ds^2 = c^2dt^2 - dr^2 - B(r)\,d\phi^2 - dz^2,$$

where $B(r)$ satisfies 

$$\frac{B''}{2B} - \frac{(B')^2}{4B^2} = -\kappa\rho c^2.$$

Show further that $b(r) = \sqrt{B(r)}$ satisfies $b'' = -\kappa c^2\rho b$. 

Hint: You may find your answers to Exercises 8.9, 9.28 and 9.29 useful. 

12.12 Suppose that the matter distribution in a cosmic string has a uniform density across 
the string, such that 

$$\rho(r) = \begin{cases} \rho_0 & \text{for } r < r_0, \\ 0 & \text{for } r > r_0. \end{cases}$$

By demanding that $g_{\phi\phi} \to -r^2$ as $r \to 0$, so that the spacetime geometry is regular 
on the axis of the string, show that the line element for $r < r_0$ is 

$$ds^2 = c^2dt^2 - dr^2 - \frac{\sin^2\lambda r}{\lambda^2}\,d\phi^2 - dz^2,$$

where $\lambda = \sqrt{\kappa\rho_0c^2}$. By demanding that $g_{\phi\phi}$ and its derivative with respect to $r$ 
are both continuous at $r = r_0$, show that the line element for $r > r_0$ is 

$$ds^2 = c^2dt^2 - dr^2 - \left[\frac{\sin\lambda r_0}{\lambda} + (r - r_0)\cos\lambda r_0\right]^2 d\phi^2 - dz^2.$$

For the interesting case in which $\lambda r_0 \ll 1$, show that for $r \gg r_0$ the line element 
takes the form 

$$ds^2 = c^2dt^2 - dr^2 - \left(1 - \frac{8G\mu}{c^2}\right)r^2\,d\phi^2 - dz^2,$$

where $\mu = \pi r_0^2\rho_0$ is the ‘mass per unit length’ of the string. Interpret this line 
element physically. 

12.13 Show that the electromagnetic field tensor outside a static spherically symmetric 
charged matter distribution has the form 


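$$[F_{\mu\nu}] = E(r)\begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix},$$

where $E(r)$ is some arbitrary function. Hence show that, if the line element outside 
the matter distribution has the form 

$$ds^2 = A(r)\,dt^2 - B(r)\,dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right),$$

the energy-momentum tensor of the electromagnetic field in this region is given by 

$$[T_{\mu\nu}] = \tfrac{1}{2}c^2\epsilon_0E^2\,\mathrm{diag}\left(\frac{1}{B},\; -\frac{1}{A},\; \frac{r^2}{AB},\; \frac{r^2\sin^2\theta}{AB}\right).$$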


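12.14 Calculate the invariant curvature scalar $R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}$ for the Reissner-Nordstrom 
geometry and hence show that the only intrinsic singularity occurs at $r = 0$. 

12.15 Show that the worldlines of radially moving photons in the Reissner-Nordstrom 
geometry are given by 

$$ct = r - \frac{r_-^2}{r_+ - r_-}\ln\left|\frac{r}{r_-} - 1\right| + \frac{r_+^2}{r_+ - r_-}\ln\left|\frac{r}{r_+} - 1\right| + \text{constant} \quad \text{(outgoing)},$$

$$ct = -r + \frac{r_-^2}{r_+ - r_-}\ln\left|\frac{r}{r_-} - 1\right| - \frac{r_+^2}{r_+ - r_-}\ln\left|\frac{r}{r_+} - 1\right| + \text{constant} \quad \text{(ingoing)}.$$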


12.16 Show that, by introducing the advanced Eddington-Finkelstein timelike coordinate 

$$ct' = ct - \frac{r_-^2}{r_+ - r_-}\ln\left|\frac{r}{r_-} - 1\right| + \frac{r_+^2}{r_+ - r_-}\ln\left|\frac{r}{r_+} - 1\right|,$$

the Reissner-Nordstrom line element takes the form 

$$ds^2 = c^2\Delta\,dt'^2 - 2c(1 - \Delta)\,dt'\,dr - (2 - \Delta)\,dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right),$$

where $\Delta \equiv \Delta(r) = 1 - 2\mu/r + q^2/r^2$. Hence show that the worldlines of radially 
moving photons in advanced Eddington-Finkelstein coordinates are given by 

$$ct' + r = \text{constant} \quad \text{(incoming)}, \qquad c\,\frac{dt'}{dr} = \frac{2 - \Delta}{\Delta} \quad \text{(outgoing)}.$$

What is the significance, if any, of the fact that $c\,dt'/dr = 0$ at $\Delta(r) = 2$ for 
outgoing radially moving photons? 

12.17 For a particle of mass $m$ and charge $e$ in geodesic motion in the Reissner-Nordstrom 
geometry, show that the quantity 


$$k = mc^2\left(1 - \frac{2\mu}{r} + \frac{q^2}{r^2}\right)\frac{dt}{d\tau} + \frac{eq}{r}$$


is conserved, and interpret this result physically. 

12.18 An observer is in a circular orbit of coordinate radius $r = R$ in the Reissner-Nordstrom 
geometry. Find the components of the magnetic field measured by the 
observer. 



13 

The Kerr geometry 


The Schwarzschild solution describes the spacetime geometry outside a spherically 
symmetric massive object, characterised only by its mass $M$. In the previous 
chapter we derived further spherically symmetric solutions. Most real astrophysical 
objects, however, are rotating. In this case, a spherically symmetric solution 
cannot apply because the rotation axis of the object defines a special direction, so 
destroying the isotropy of the solution. For this reason, in general relativity it is not 
possible to find a coordinate system that reduces the spacetime geometry outside 
a rotating (uncharged) body to the Schwarzschild geometry. The non-linear field 
equations couple the source to the exterior geometry. Moreover, a rotating body 
is characterised not only by its mass $M$ but also by its angular momentum $J$, and 
so we would expect the corresponding spacetime metric to depend upon these 
two parameters. 

We now consider how to derive the metric describing the spacetime geometry 
outside a rotating body. Since the mathematical complexity in this case is far 
greater than that encountered in deriving the Schwarzschild metric (or the other 
spherically symmetric geometries discussed in the previous chapter), we shall 
content ourselves with just an outline of how the solution may be obtained. 


13.1 The general stationary axisymmetric metric 

In our derivation of the Schwarzschild solution, we began by constructing the 
general form of the static isotropic metric. We are now interested in deriving the 
spacetime geometry outside a steadily rotating massive body. Thus we begin by 
constructing the general form of the stationary axisymmetric metric. 

For the description of such a spacetime, it is convenient to introduce the 
timelike coordinate $t\,(= x^0)$ and the azimuthal angle $\phi\,(= x^3)$ about the axis of 
symmetry. The stationary and axisymmetric character of the spacetime requires 
that the metric coefficients $g_{\mu\nu}$ be independent of $t$ and $\phi$, so that 

$$g_{\mu\nu} = g_{\mu\nu}(x^1, x^2),$$

where $x^1$ and $x^2$ are the two remaining spacelike coordinates. 

Besides stationarity and axisymmetry, we shall also require that the line element 
is invariant to simultaneous inversion of the coordinates $t$ and $\phi$, i.e. the 
transformations 

$$t \to -t \quad \text{and} \quad \phi \to -\phi.$$

The physical meaning of this additional requirement is that the source of the 
gravitational field, whatever it may be, has motions that are purely rotational 
about the axis of symmetry, i.e. we are considering the spacetime associated with 
a rotating body. This assumed invariance requires that 

$$g_{01} = g_{02} = g_{13} = g_{23} = 0,$$

since the corresponding terms in the line element would change sign under the 
simultaneous inversion of $t$ and $\phi$. Therefore, under the assumptions made thus 
far, the line element must have the form 

$$ds^2 = g_{00}\,dt^2 + 2g_{03}\,dt\,d\phi + g_{33}\,d\phi^2 + \left[g_{11}(dx^1)^2 + 2g_{12}\,dx^1dx^2 + g_{22}(dx^2)^2\right]. \tag{13.1}$$

We note that, since the metric coefficients $g_{\mu\nu}$ are functions only of $x^1$ and 
$x^2$, the expression in square brackets in (13.1) can be considered as a separate 
two-dimensional submanifold. A further reduction in the form of the metric can 
thus be achieved by using the fact that any two-dimensional (pseudo-)Riemannian 
manifold is conformally flat, i.e. it is always possible to find a coordinate system 
in which the metric takes the form 

$$g_{ab} = \Omega^2(x)\eta_{ab}, \tag{13.2}$$

where $\Omega^2(x)$ is an arbitrary function of the coordinates and $[\eta_{ab}] = \mathrm{diag}(\pm 1, \pm 1)$; 
the signs depend on the signature of the manifold. We proved this result in 
Appendix 11C. Thus, taking advantage of this fact, and writing the result in a way 
suggestive of a rotating body, we can express the line element (13.1) in the form 

$$ds^2 = A\,dt^2 - B(d\phi - \omega\,dt)^2 - C\left[(dx^1)^2 + (dx^2)^2\right], \tag{13.3}$$

where $A$, $B$, $C$ and $\omega$ are arbitrary functions of the spacelike coordinates $x^1$ and $x^2$. 

For definiteness, let us denote the coordinates $x^1$ and $x^2$ by $r$ and $\theta$ respectively. 
For our axisymmetric metric, these coordinates are not so readily associated with 
any geometrical meaning. Nevertheless, in order that they can be chosen later to 
be as similar as possible to the spherically symmetric $r$ and $\theta$, it is useful to allow 
some extra freedom in the metric by not demanding that the metric coefficients 
$g_{11}$ and $g_{22}$ be identical. Thus, from now on we will work with the metric 

$$ds^2 = A\,dt^2 - B(d\phi - \omega\,dt)^2 - C\,dr^2 - D\,d\theta^2, \tag{13.4}$$

where $A$, $B$, $C$, $D$ and $\omega$ are arbitrary functions of the spacelike coordinates $r$ 
and $\theta$ but we have the freedom to relate $C$ and $D$ in such a way that the 
physical meanings of $r$ and $\theta$ are as close as possible to the spherically symmetric 
case. The functions in (13.4) are related to the metric coefficients $g_{\mu\nu}$ by 

$$g_{tt} = A - B\omega^2, \quad g_{t\phi} = B\omega, \quad g_{\phi\phi} = -B, \quad g_{rr} = -C, \quad g_{\theta\theta} = -D,$$

where, from now on, we use coordinate names rather than numbers to denote the 
components. Note that $\omega = -g_{t\phi}/g_{\phi\phi}$ and, if the body is not rotating, we can set 
$\omega = 0$ since in this case we would require that the metric is invariant under the 
single transformation $t \to -t$ and consequently $g_{t\phi} = 0$. 

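For later convenience, let us also calculate the contravariant components $g^{\mu\nu}$ of 
the metric corresponding to the line element (13.4). The only off-diagonal terms 
involve $t$ and $\phi$, and so immediately we have 

$$g^{rr} = -1/C, \qquad g^{\theta\theta} = -1/D.$$

To find the remaining contravariant components, we must invert the matrix 

$$G = \begin{pmatrix} g_{tt} & g_{t\phi} \\ g_{t\phi} & g_{\phi\phi} \end{pmatrix} \quad \Rightarrow \quad G^{-1} = \frac{1}{|G|}\begin{pmatrix} g_{\phi\phi} & -g_{t\phi} \\ -g_{t\phi} & g_{tt} \end{pmatrix},$$

where the determinant $|G| = g_{tt}g_{\phi\phi} - (g_{t\phi})^2 = -AB$. Thus 

$$g^{tt} = \frac{g_{\phi\phi}}{|G|} = \frac{1}{A}, \qquad g^{t\phi} = -\frac{g_{t\phi}}{|G|} = \frac{\omega}{A}, \qquad g^{\phi\phi} = \frac{g_{tt}}{|G|} = \frac{B\omega^2 - A}{AB}. \tag{13.5}$$
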
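
The inversion leading to (13.5) is easy to confirm numerically. The following sketch builds the $(t, \phi)$ block of the metric from sample values of $A$, $B$ and $\omega$, inverts it with the $2 \times 2$ cofactor formula, and compares the result with the closed forms $1/A$, $\omega/A$ and $(B\omega^2 - A)/(AB)$. The function names and the sample values are arbitrary choices made for illustration.

```python
def covariant_block(A, B, omega):
    """Return (g_tt, g_tphi, g_phiphi) for the metric (13.4)."""
    return A - B * omega**2, B * omega, -B


def contravariant_block(A, B, omega):
    """Return (g^tt, g^tphi, g^phiphi) by inverting the 2x2 (t, phi) block."""
    g_tt, g_tp, g_pp = covariant_block(A, B, omega)
    det = g_tt * g_pp - g_tp**2  # equals -A * B
    return g_pp / det, -g_tp / det, g_tt / det


A, B, omega = 2.0, 3.0, 0.5  # arbitrary sample values
inv_tt, inv_tp, inv_pp = contravariant_block(A, B, omega)
print(inv_tt, 1.0 / A)
print(inv_tp, omega / A)
print(inv_pp, (B * omega**2 - A) / (A * B))
```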


Shortly we will show that a metric of the form (13.4) can indeed be made to 
satisfy the empty-space field equations $R_{\mu\nu} = 0$ by suitable choice of the functions 
$A$, $B$, $C$, $D$ and $\omega$. Before specialising to any particular solution, however, we 
investigate three particularly interesting generic properties of such spacetimes: 
the dragging of inertial frames and the existence of stationary limit surfaces and 
event horizons. 


13.2 The dragging of inertial frames 

The presence of $g_{t\phi} \neq 0$ in the metric (13.4) introduces qualitatively new effects 
into particle trajectories. Since $g_{\mu\nu}$ is independent of $\phi$, the covariant component 
$p_\phi$ of a particle’s 4-momentum is still conserved along its geodesic. Indeed 
$p_\phi = -L$, where $L$ is the component of angular momentum of the particle along 
the rotation axis, which is conserved (note the minus sign, which also occurred in 
the Schwarzschild case discussed in Chapter 9). This conservation law is a direct 
consequence of the axisymmetry of the spacetime. Note, however, that the total 
angular momentum of a particle is not a conserved quantity, since the spacetime 
is not spherically symmetric about any point. 

The corresponding contravariant component $p^\phi$ of the particle’s 4-momentum 
is given by 

$$p^\phi = g^{\phi\mu}p_\mu = g^{\phi t}p_t + g^{\phi\phi}p_\phi,$$

and similarly the contravariant time component of the 4-momentum is 

$$p^t = g^{t\mu}p_\mu = g^{tt}p_t + g^{t\phi}p_\phi.$$

Let us now consider a particle (or photon) with zero angular momentum, so that 
$p_\phi = 0$ along its geodesic. Using the definition of the 4-momentum, for either a 
massive particle or a photon we have 

$$p^t \propto \frac{dt}{d\sigma} \quad \text{and} \quad p^\phi \propto \frac{d\phi}{d\sigma},$$

where $\sigma$ is an affine parameter along the geodesic and the constants of 
proportionality in each case are equal. Thus the particle’s trajectory is such that 

$$\frac{d\phi}{dt} = \frac{p^\phi}{p^t} = \frac{g^{t\phi}}{g^{tt}} = \omega(r, \theta).$$

This equation defines what we mean by $\omega$: it is the coordinate angular velocity 
of a zero-angular-momentum particle. 
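
The argument above can be followed step by step in a few lines of code. Given covariant components $g_{tt}$, $g_{t\phi}$ and $g_{\phi\phi}$ at some point, the sketch raises the index on a 4-momentum with $p_\phi = 0$ and forms the ratio $p^\phi/p^t$, which should equal $-g_{t\phi}/g_{\phi\phi}$ whatever the value of $p_t$. The function name and the sample numbers are hypothetical.

```python
def zero_angular_momentum_rate(g_tt, g_tphi, g_phiphi, p_t=1.0):
    """Return d(phi)/dt = p^phi / p^t for a particle with p_phi = 0."""
    det = g_tt * g_phiphi - g_tphi**2
    inv_tt = g_phiphi / det      # g^tt
    inv_tphi = -g_tphi / det     # g^tphi
    p_up_t = inv_tt * p_t        # p^t   = g^tt p_t     (p_phi = 0)
    p_up_phi = inv_tphi * p_t    # p^phi = g^(phi t) p_t
    return p_up_phi / p_up_t


# Sample components built from A = 2, B = 3, omega = 0.5 in the metric (13.4).
g_tt, g_tphi, g_phiphi = 2.0 - 3.0 * 0.5**2, 3.0 * 0.5, -3.0
print(zero_angular_momentum_rate(g_tt, g_tphi, g_phiphi))
print(-g_tphi / g_phiphi)
```

When $g_{t\phi} = 0$ the function returns zero, corresponding to a non-rotating source.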

We shall find the explicit form for $\omega$ for the Kerr geometry later, but it is 
clear that this effect is present in any metric for which $g_{t\phi} \neq 0$, which in turn 
happens whenever the source of the gravitational field is rotating. So we have the 
remarkable result that a particle dropped ‘straight in’ from infinity ($p_\phi = 0$) is 
‘dragged’ just by the influence of gravity so that it acquires an angular velocity 
in the same sense as that of the source of the metric. This effect weakens with 
distance (roughly as $\sim 1/r^3$ for the Kerr metric) and makes the angular momentum 
of the source measurable in practice. 

The effect is called the dragging of inertial frames. Remember that inertial
frames are defined as those in which free-falling test bodies are stationary or move
along straight lines at constant speed. Consider the freely falling particle discussed
above. At any spatial point $(r, \theta, \phi)$, in order for the particle to be at rest in some
(inertial) frame the frame must be moving with an angular speed $\omega(r, \theta)$. Any other
inertial frame is then related to this instantaneous rest frame by a Lorentz transformation.
Thus the inertial frames are ‘dragged’ by the rotating source. A schematic
illustration of this effect in a plane $\theta$ = constant is shown in Figure 13.1, where
the spacetime around the source is viewed along the rotation axis.

Figure 13.1 A schematic illustration of the dragging of inertial frames around
a rotating source.
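The definition of $\omega$ can be checked numerically for any metric of the stationary axisymmetric form. The following is a minimal Python sketch, assuming that the $(t, \phi)$ block of the covariant metric is supplied as three numbers; the function name and the sample values are illustrative and do not come from the text.

```python
def zamo_angular_velocity(g_tt, g_tphi, g_phiphi):
    """Return omega = g^{t phi} / g^{tt} from the covariant (t, phi) block."""
    det = g_tt * g_phiphi - g_tphi ** 2   # determinant of the 2x2 block
    g_up_tt = g_phiphi / det              # entries of the inverse 2x2 matrix
    g_up_tphi = -g_tphi / det
    return g_up_tphi / g_up_tt            # algebraically equal to -g_tphi / g_phiphi

# Hypothetical sample values with signature (+, -, -, -)
omega = zamo_angular_velocity(g_tt=0.5, g_tphi=0.2, g_phiphi=-4.0)
print(omega)
```

The ratio reduces to $-g_{t\phi}/g_{\phi\phi}$, so $\omega$ vanishes whenever $g_{t\phi} = 0$, as expected for a non-rotating source.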


13.3 Stationary limit surfaces 

A second generic property of spacetimes outside a rotating source is the existence
of stationary limit surfaces; this is related to the dragging of inertial frames. This
effect may be illustrated by considering, for example, photons emitted from a
position with fixed spatial coordinates $(r, \theta, \phi)$ in the spacetime. In particular,
consider those photons emitted in the $\pm\phi$ directions so that, at first, only $dt$ and
$d\phi$ are non-zero along the path. Since $ds^2 = 0$ for a photon trajectory, we have

$$g_{tt}\,dt^2 + 2g_{t\phi}\,dt\,d\phi + g_{\phi\phi}\,d\phi^2 = 0,$$

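from which we obtain

$$\frac{d\phi}{dt} = -\frac{g_{t\phi}}{g_{\phi\phi}} \pm \left[\left(\frac{g_{t\phi}}{g_{\phi\phi}}\right)^2 - \frac{g_{tt}}{g_{\phi\phi}}\right]^{1/2}.$$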

Now, provided that $g_{tt}(r, \theta) > 0$ at the point of emission, we see that $d\phi/dt$ is
positive (negative) for a photon emitted in the positive (negative) $\phi$-direction, as
we would expect, although the value of $d\phi/dt$ is different for the two directions.
On any surface defined by $g_{tt}(r, \theta) = 0$, however, a remarkable thing happens.
The two solutions of the above equation in this case are

$$\frac{d\phi}{dt} = -\frac{2g_{t\phi}}{g_{\phi\phi}} \qquad\text{and}\qquad \frac{d\phi}{dt} = 0.$$

The first solution represents the photon sent off in the same direction as the source
rotation, and the second solution corresponds to the photon sent in the opposite
direction. For this second case, we see that when $g_{tt} = 0$ the dragging of orbits is so
severe that the photon initially does not move at all! Clearly, any massive particle,
which must move more slowly than a photon, will therefore have to rotate with
the source, even if it has an angular momentum arbitrarily large in the opposite
sense. Any surface defined by $g_{tt}(r, \theta) = 0$ is called a stationary limit surface.
Inside the surface, where $g_{tt} < 0$, no particle can remain at fixed $(r, \theta, \phi)$ but must
instead rotate around the source in the same sense as the source’s rotation. This
is consistent with our discussion of the Schwarzschild metric, for which $g_{tt} = 0$
occurs at $r = 2\mu$, within which no particle can remain at fixed spatial coordinates.
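The behaviour of the two photon solutions on either side of a stationary limit surface can be illustrated with a short Python sketch. It assumes only the quadratic condition $g_{tt} + 2g_{t\phi}W + g_{\phi\phi}W^2 = 0$ for $W = d\phi/dt$; the numerical metric values are hypothetical and chosen merely to have the right signs.

```python
import math

def photon_angular_velocities(g_tt, g_tphi, g_phiphi):
    """Roots W = dphi/dt of g_tt + 2 g_tphi W + g_phiphi W^2 = 0."""
    omega = -g_tphi / g_phiphi
    root = math.sqrt(omega ** 2 - g_tt / g_phiphi)
    return omega - root, omega + root

# Hypothetical values: g_phiphi < 0, and g_tphi > 0 for a source rotating in +phi
outside = photon_angular_velocities(0.5, 0.2, -4.0)     # g_tt > 0
on_limit = photon_angular_velocities(0.0, 0.2, -4.0)    # g_tt = 0
inside = photon_angular_velocities(-0.005, 0.2, -4.0)   # g_tt < 0
print(outside, on_limit, inside)
```

For $g_{tt} > 0$ the two roots have opposite signs, on the surface $g_{tt} = 0$ the smaller root vanishes, and for $g_{tt} < 0$ both roots are positive, so that even a photon must rotate with the source.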

The fact that a particle (or observer) cannot remain at a fixed $(r, \theta, \phi)$ inside a
stationary limit surface, where $g_{tt} < 0$, may also be shown directly by considering
the 4-velocity of an observer at fixed $(r, \theta, \phi)$, which is given by

$$[u^\mu] = (u^t, 0, 0, 0). \tag{13.6}$$

We require, however, that $\mathbf{u}\cdot\mathbf{u} = g_{tt}(u^t)^2 = c^2$, but this cannot be satisfied if
$g_{tt} < 0$, hence showing that a 4-velocity of the form (13.6) is not possible in such
a region.

Any surface defined by $g_{tt} = 0$ is also physically interesting in another way. In
Appendix 9A, we presented a general approach to the calculation of gravitational
redshifts. In particular, we showed that, for an emitter E and receiver R with fixed
spatial coordinates in a stationary spacetime (i.e. one for which $\partial_t g_{\mu\nu} = 0$), the
gravitational frequency shift of a photon is, quite generally,

$$\frac{\nu_R}{\nu_E} = \left[\frac{g_{tt}(A)}{g_{tt}(B)}\right]^{1/2},$$

where $A$ is the event at which the photon is emitted and $B$ the event at which
it is received. Thus, we see that if the photon is emitted from a point with fixed
spatial coordinates, then $\nu_R \to 0$ in the limit $g_{tt} \to 0$, so that the photon suffers
an infinite redshift. Thus a surface defined by $g_{tt}(r, \theta) = 0$ is also often called
an infinite redshift surface. This is again consistent with our discussion of the
Schwarzschild metric, for which the surface $r = 2\mu$ (where $g_{tt} = 0$) is indeed an
infinite redshift surface.
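As a simple numerical illustration of the frequency-shift formula, the following Python sketch evaluates $\nu_R/\nu_E$ for the Schwarzschild metric, with the receiver placed at a very large radius. The units ($\mu = 1$, $c = 1$), the function names and the chosen radii are assumptions made for the example.

```python
import math

def frequency_ratio(g_tt_emit, g_tt_recv):
    """nu_R / nu_E for an emitter and receiver at fixed spatial coordinates."""
    return math.sqrt(g_tt_emit / g_tt_recv)

def schwarzschild_g_tt(r, mu, c=1.0):
    """g_tt of the Schwarzschild metric with signature (+, -, -, -)."""
    return c ** 2 * (1.0 - 2.0 * mu / r)

mu = 1.0                                  # assumed units: mu = 1, c = 1
g_far = schwarzschild_g_tt(1.0e9, mu)     # receiver effectively at infinity
ratios = [frequency_ratio(schwarzschild_g_tt(r, mu), g_far)
          for r in (10.0, 3.0, 2.1, 2.001)]
print(ratios)
```

The ratio falls towards zero as the emitter approaches $r = 2\mu$, which is the infinite redshift described above.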


13.4 Event horizons 

In the Schwarzschild metric, the surface $r = 2\mu$ is both a surface of infinite
redshift and an event horizon, but in our more general axisymmetric spacetime 
these surfaces need not coincide. In general, as we shall see below, the defining 
property of an event horizon is that it is a null 3-surface, i.e. a surface whose 
normal at every point is a null vector. 
Before discussing the particular case of a stationary axisymmetric spacetime,
let us briefly consider null 3-surfaces in general. Suppose that such a surface is
defined by the equation

$$f(x^\mu) = 0.$$

The normal to the surface is directed along the 4-gradient $n_\mu = \nabla_\mu f = \partial_\mu f$
(remembering that $f$ is a scalar quantity), and for a null surface we have

$$g^{\mu\nu} n_\mu n_\nu = 0. \tag{13.7}$$

This last property means that the direction of the normal lies in the surface
itself; along the surface $df = n_\mu\,dx^\mu = 0$, and this equation is satisfied when the
directions of the 4-vectors $dx^\mu$ and $n^\mu$ coincide. In this same direction, from the
property (13.7) we see that the element of length in the 3-surface is $ds = 0$. In
other words, along this direction the 3-surface is tangent, at any given point, to 
the lightcone at that point. Thus, the lightcone at each point of a null 3-surface 
(say, in the future direction) lies entirely on one side of the surface and is tangent 
to the 3-surface at that point. This means that the (future-directed) worldline of a 
particle or photon can cross a null 3-surface in only one direction, and hence the 
latter forms an event horizon. 

In a stationary axisymmetric spacetime the equation of the surface must take
the form

$$f(r, \theta) = 0.$$

Moreover, the condition that the surface is null means that

$$g^{\mu\nu}(\partial_\mu f)(\partial_\nu f) = 0,$$

which, for a metric of the form (13.4), reduces to

$$g^{rr}(\partial_r f)^2 + g^{\theta\theta}(\partial_\theta f)^2 = 0. \tag{13.8}$$

This is therefore the general condition for a surface $f(r, \theta) = 0$ to be an event horizon.

We may, however, choose our coordinates $r$ and $\theta$ in such a way that we can
write the equation of the surface as $f(r) = 0$, i.e. as a function of $r$ alone. In this
case, the condition (13.8) reduces to

$$g^{rr}(\partial_r f)^2 = 0,$$

from which we see that an event horizon occurs when $g^{rr} = 0$, or equivalently
$g_{rr} = \infty$. This is consistent with our analysis of the Schwarzschild metric, for
which $g_{rr} = \infty$ at $r = 2\mu$.
13.5 The Kerr metric

So far our discussion has been limited to using symmetry arguments to restrict
the possible form of the stationary axisymmetric line element, which we assumed
to be

$$ds^2 = g_{tt}\,dt^2 + 2g_{t\phi}\,dt\,d\phi + g_{\phi\phi}\,d\phi^2 + g_{rr}\,dr^2 + g_{\theta\theta}\,d\theta^2 \tag{13.9}$$

or, equivalently,

$$ds^2 = A\,dt^2 - B(d\phi - \omega\,dt)^2 - C\,dr^2 - D\,d\theta^2, \tag{13.10}$$

where the arbitrary functions in either form depend only on $r$ and $\theta$. As we have
seen, the general form of this line element leads to some interesting new physical
phenomena in such spacetimes. Nevertheless, we must now verify that such a
line element does indeed satisfy Einstein’s gravitational field equations and thus
obtain explicit forms for the metric functions appearing in $ds^2$.

The general approach to performing this calculation is the same as that used in
deriving the Schwarzschild metric. We first calculate the connection coefficients
$\Gamma^\sigma{}_{\mu\nu}$ for the metric (13.9) or (13.10) and then use these coefficients to obtain
expressions for the components $R_{\mu\nu}$ of the Ricci tensor in terms of the unknown
functions in the line element. Since we are again interested in the spacetime
geometry outside the rotating matter distribution, we must then solve the
empty-space field equations

$$R_{\mu\nu} = 0.$$

Although this process is conceptually straightforward, it is algebraically very
complicated, and the full calculation is extremely lengthy.¹

In fact, one finds that the Einstein equations alone are insufficient to determine
all the unknown functions uniquely. This should not come as a surprise
since the requirement of axisymmetry is far less restrictive than that of spherical
symmetry, used in the derivation of the Schwarzschild geometry. Although we
are envisaging a ‘compact’ rotating body, such as a star or planet, the general
form of the metric (13.10) would also be valid outside a rotating ‘extended’
axisymmetric body, such as a rotating cosmic string. To obtain the Kerr metric,
we must therefore impose some additional conditions on the solution. It transpires
that if we demand that the spacetime geometry tends to the Minkowski
form as $r \to \infty$ and that somewhere there exists a smooth closed convex event
horizon outside which the geometry is non-singular, then the solution is unique.

¹ For a full derivation, see (for example) S. Chandrasekhar, The Mathematical Theory of Black Holes, Oxford
University Press, 1983.
In this case, in terms of our ‘Schwarzschild-like’ coordinates $(t, r, \theta, \phi)$, the line
element for the Kerr geometry takes the form

$$ds^2 = c^2\left(1 - \frac{2\mu r}{\rho^2}\right)dt^2 + \frac{4\mu a c r\sin^2\theta}{\rho^2}\,dt\,d\phi - \frac{\rho^2}{\Delta}\,dr^2 - \rho^2\,d\theta^2 - \left(r^2 + a^2 + \frac{2\mu r a^2\sin^2\theta}{\rho^2}\right)\sin^2\theta\,d\phi^2, \tag{13.11}$$

where $\mu$ and $a$ are constants and we have introduced the functions $\rho^2$ and $\Delta$,
defined by

$$\rho^2 = r^2 + a^2\cos^2\theta, \qquad \Delta = r^2 - 2\mu r + a^2.$$

This standard expression for $ds^2$ is known as the Boyer-Lindquist form and
$(t, r, \theta, \phi)$ as Boyer-Lindquist coordinates. The dedicated student may wish to
verify that this metric does indeed satisfy the empty-space field equations.

We can write the metric (13.11) in several other useful forms. In particular, it
is common also to define the function

$$\Sigma^2 = (r^2 + a^2)^2 - a^2\Delta\sin^2\theta$$

and write the metric as

$$ds^2 = \frac{\Delta - a^2\sin^2\theta}{\rho^2}\,c^2dt^2 + \frac{4\mu a r\sin^2\theta}{\rho^2}\,c\,dt\,d\phi - \frac{\rho^2}{\Delta}\,dr^2 - \rho^2\,d\theta^2 - \frac{\Sigma^2\sin^2\theta}{\rho^2}\,d\phi^2. \tag{13.12}$$


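This form can be rearranged in a manner that is more suggestive of a rotating
object, to give

$$ds^2 = \frac{\rho^2\Delta}{\Sigma^2}\,c^2dt^2 - \frac{\Sigma^2\sin^2\theta}{\rho^2}\,(d\phi - \omega\,dt)^2 - \frac{\rho^2}{\Delta}\,dr^2 - \rho^2\,d\theta^2, \tag{13.13}$$

where the physically meaningful function $\omega$ is given by $\omega = 2\mu cra/\Sigma^2$.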
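The equivalence of the forms (13.11) and (13.13) rests on some lengthy algebra, which can at least be spot-checked numerically. The Python sketch below reads off $g_{tt}$, $g_{t\phi}$ and $g_{\phi\phi}$ from each form at one sample point and compares them. The function names and the sample point are illustrative, and the coefficient of $c^2dt^2$ in (13.13) is taken here to be $\rho^2\Delta/\Sigma^2$.

```python
import math

def kerr_scalars(r, theta, mu, a):
    """Return rho^2, Delta, Sigma^2 and sin^2(theta) as defined in the text."""
    s2 = math.sin(theta) ** 2
    rho2 = r ** 2 + a ** 2 * (1.0 - s2)
    delta = r ** 2 - 2.0 * mu * r + a ** 2
    sigma2 = (r ** 2 + a ** 2) ** 2 - a ** 2 * delta * s2
    return rho2, delta, sigma2, s2

def block_from_13_11(r, theta, mu, a, c=1.0):
    """(g_tt, g_tphi, g_phiphi) read off from the form (13.11)."""
    rho2, _, _, s2 = kerr_scalars(r, theta, mu, a)
    g_tt = c ** 2 * (1.0 - 2.0 * mu * r / rho2)
    g_tphi = 2.0 * mu * a * c * r * s2 / rho2   # half the dt dphi coefficient
    g_phiphi = -(r ** 2 + a ** 2 + 2.0 * mu * r * a ** 2 * s2 / rho2) * s2
    return g_tt, g_tphi, g_phiphi

def block_from_13_13(r, theta, mu, a, c=1.0):
    """The same components obtained by expanding the form (13.13)."""
    rho2, delta, sigma2, s2 = kerr_scalars(r, theta, mu, a)
    big_a = c ** 2 * rho2 * delta / sigma2      # coefficient of dt^2
    big_b = sigma2 * s2 / rho2                  # coefficient of (dphi - omega dt)^2
    omega = 2.0 * mu * c * r * a / sigma2
    return big_a - big_b * omega ** 2, big_b * omega, -big_b

first = block_from_13_11(3.0, 1.0, 1.0, 0.7)    # arbitrary sample point
second = block_from_13_13(3.0, 1.0, 1.0, 0.7)
max_diff = max(abs(p - q) for p, q in zip(first, second))
print(max_diff)
```

Agreement at a single point is of course not a proof, but it is a quick guard against sign or factor errors when transcribing the metric.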

For later convenience, it is useful to calculate the contravariant components $g^{\mu\nu}$
of the Kerr metric in Boyer-Lindquist coordinates. Using our earlier calculations
for the general stationary axisymmetric metric, we find that $g^{rr}$ and $g^{\theta\theta}$ are simply
the reciprocals of $g_{rr}$ and $g_{\theta\theta}$ respectively,

$$g^{rr} = -\frac{\Delta}{\rho^2}, \qquad g^{\theta\theta} = -\frac{1}{\rho^2},$$

whereas the remaining contravariant components are given by

$$g^{tt} = \frac{\Sigma^2}{c^2\rho^2\Delta}, \qquad g^{t\phi} = \frac{2\mu ar}{c\rho^2\Delta}, \qquad g^{\phi\phi} = \frac{a^2\sin^2\theta - \Delta}{\rho^2\Delta\sin^2\theta}.$$
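The closed forms for $g^{tt}$, $g^{t\phi}$ and $g^{\phi\phi}$ may be checked against the covariant components of (13.12) by confirming that the two $(t, \phi)$ blocks multiply to the identity. The following Python sketch does this at one sample point; the names, the sample values and the choice $c = 2$ (used only to exercise the factors of $c$) are assumptions for illustration.

```python
import math

def kerr_blocks(r, theta, mu, a, c=1.0):
    """Return the covariant and contravariant (t, phi) blocks of the Kerr metric."""
    s2 = math.sin(theta) ** 2
    rho2 = r ** 2 + a ** 2 * (1.0 - s2)
    delta = r ** 2 - 2.0 * mu * r + a ** 2
    sigma2 = (r ** 2 + a ** 2) ** 2 - a ** 2 * delta * s2
    lower = (c ** 2 * (delta - a ** 2 * s2) / rho2,        # g_tt
             2.0 * mu * a * c * r * s2 / rho2,             # g_tphi
             -sigma2 * s2 / rho2)                          # g_phiphi
    upper = (sigma2 / (c ** 2 * rho2 * delta),             # g^tt
             2.0 * mu * a * r / (c * rho2 * delta),        # g^tphi
             (a ** 2 * s2 - delta) / (rho2 * delta * s2))  # g^phiphi
    return lower, upper

(g_tt, g_tp, g_pp), (G_tt, G_tp, G_pp) = kerr_blocks(3.0, 1.0, 1.0, 0.7, c=2.0)
identity = (G_tt * g_tt + G_tp * g_tp,   # should equal 1
            G_tt * g_tp + G_tp * g_pp,   # should equal 0
            G_tp * g_tp + G_pp * g_pp)   # should equal 1
print(identity)
```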


13.6 Limits of the Kerr metric 

We see that the Kerr metric depends on two parameters $\mu$ and $a$, as we might
expect for a rotating body. Moreover, in the limit $a \to 0$,

$$\Delta \to r^2\left(1 - \frac{2\mu}{r}\right), \qquad \rho^2 \to r^2, \qquad \Sigma^2 \to r^4,$$

and so any of the forms for the Kerr metric above tends to the Schwarzschild
form,

$$ds^2 \to c^2\left(1 - \frac{2\mu}{r}\right)dt^2 - \left(1 - \frac{2\mu}{r}\right)^{-1}dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2.$$

This suggests that we should make the identification $\mu = GM/c^2$, where $M$ is
the mass of the body, and also that $a$ corresponds in some way to the angular
velocity of the body. In fact, by investigating the slow-rotation weak-field limit
(see Section 13.20), one can show that the angular momentum $J$ of the body
about its rotation axis is given by $J = Mac$.

The fact that the Kerr metric tends to the Schwarzschild metric as $a \to 0$ allows
us to give some geometrical meaning to the coordinates $r$ and $\theta$ in the limit of a
slowly rotating body. In the general case, however, $r$ and $\theta$ are not the standard
Schwarzschild polar coordinates. In particular, from (13.11) we see that surfaces
$t$ = constant, $r$ = constant do not have the metric of 2-spheres.

The geometrical nature of Boyer-Lindquist coordinates is elucidated further by
considering the Kerr metric in the limit $\mu \to 0$, i.e. in the absence of a gravitating
mass, in which case the spacetime should be Minkowski. One quickly finds that,
in this limit, the line element becomes

$$ds^2 = c^2dt^2 - \frac{\rho^2}{r^2 + a^2}\,dr^2 - \rho^2\,d\theta^2 - (r^2 + a^2)\sin^2\theta\,d\phi^2.$$
This is indeed the Minkowski metric $ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2$, but written in
terms of spatial coordinates $(r, \theta, \phi)$ that are related to Cartesian coordinates by²

$$x = \sqrt{r^2 + a^2}\,\sin\theta\cos\phi,$$
$$y = \sqrt{r^2 + a^2}\,\sin\theta\sin\phi,$$
$$z = r\cos\theta,$$

where $r \geq 0$, $0 \leq \theta \leq \pi$ and $0 \leq \phi < 2\pi$ (see Figure 13.2).

In this case (with $\mu = 0$), the surfaces $r$ = constant are oblate ellipsoids of
rotation about the $z$-axis, given by

$$\frac{x^2 + y^2}{r^2 + a^2} + \frac{z^2}{r^2} = 1.$$

The special case $r = 0$ corresponds to the disc of radius $a$ in the equatorial plane,
centred on the origin of the Cartesian coordinates. The surfaces $\theta$ = constant
correspond to hyperbolae of revolution about the $z$-axis given by

$$\frac{x^2 + y^2}{a^2\sin^2\theta} - \frac{z^2}{a^2\cos^2\theta} = 1.$$

The asymptote for large values of $r$ is a cone, with its vertex at the origin, that
subtends a half-angle $\theta$. The angle $\phi$ is the standard azimuthal angle. Clearly, in
the limit $a \to 0$ the coordinates $(r, \theta, \phi)$ correspond to standard spherical polar
coordinates. It should be remembered, however, that the simple interpretation of
the coordinates given above no longer holds in the general case of the Kerr metric,
when $\mu \neq 0$.

² The coordinates $(r, \theta, \phi)$ are related to the standard oblate spheroidal coordinates $(\xi, \eta, \phi)$ by $r = a\sinh\xi$
and $\theta = \eta - \pi/2$; see, for example, M. Abramowitz & I. Stegun, Handbook of Mathematical Functions,
Dover (1972).
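The geometrical statements about the $\mu = 0$ limit are easily confirmed numerically. The short Python sketch below maps one arbitrarily chosen point $(r, \theta, \phi)$ to Cartesian coordinates and evaluates the left-hand sides of the ellipsoid and hyperboloid equations; the function name and sample values are illustrative only.

```python
import math

def bl_to_cartesian_flat(r, theta, phi, a):
    """Map (r, theta, phi) to (x, y, z) in the mu = 0 limit of the Kerr metric."""
    big_r = math.sqrt(r ** 2 + a ** 2)
    return (big_r * math.sin(theta) * math.cos(phi),
            big_r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

r, theta, phi, a = 2.0, 0.8, 0.3, 0.6    # arbitrary sample point
x, y, z = bl_to_cartesian_flat(r, theta, phi, a)
ellipsoid = (x ** 2 + y ** 2) / (r ** 2 + a ** 2) + z ** 2 / r ** 2
hyperboloid = ((x ** 2 + y ** 2) / (a * math.sin(theta)) ** 2
               - z ** 2 / (a * math.cos(theta)) ** 2)
print(ellipsoid, hyperboloid)   # both should be close to 1
```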


13.7 The Kerr-Schild form of the metric 

The form (13.11) for the line element is not in fact the form originally discovered
by Roy Kerr in 1963. Indeed, Kerr himself followed an approach to the derivation
very different from that presented here. His original interest was in line elements
of the general form

$$ds^2 = \eta_{\mu\nu}\,dx^\mu dx^\nu - H l_\mu l_\nu\,dx^\mu dx^\nu,$$

where the vector $l_\mu$ is null with respect to the Minkowski metric $\eta_{\mu\nu}$, i.e.

$$\eta^{\mu\nu} l_\mu l_\nu = 0.$$

This form for a line element is now known as the Kerr-Schild form. Kerr showed
that a line element of this form satisfied the empty-space field equations (together
with our additional conditions on the solution mentioned above), provided that

$$H = \frac{2\mu r^3}{r^4 + a^2z^2}, \qquad [l_\mu] = \left(c,\ \frac{rx + ay}{r^2 + a^2},\ \frac{ry - ax}{r^2 + a^2},\ \frac{z}{r}\right),$$

where $[x^\mu] = (\bar{t}, x, y, z)$ and $r$ is defined implicitly in terms of $x$, $y$ and $z$ by

$$r^4 - r^2(x^2 + y^2 + z^2 - a^2) - a^2z^2 = 0. \tag{13.14}$$


The corresponding form for the line element is given by

$$ds^2 = c^2d\bar{t}^2 - dx^2 - dy^2 - dz^2 - \frac{2\mu r^3}{r^4 + a^2z^2}\left[c\,d\bar{t} - \frac{r}{r^2 + a^2}(x\,dx + y\,dy) - \frac{a}{r^2 + a^2}(x\,dy - y\,dx) - \frac{z}{r}\,dz\right]^2. \tag{13.15}$$
It is straightforward, but lengthy, to show that the two forms (13.15) and (13.11)
for the line element are identical if the two sets of coordinates are related by

$$c\,d\bar{t} = c\,dt - \frac{2\mu r}{\Delta}\,dr, \tag{13.16}$$
$$x = (r\cos\phi' + a\sin\phi')\sin\theta, \tag{13.17}$$
$$y = (r\sin\phi' - a\cos\phi')\sin\theta, \tag{13.18}$$
$$z = r\cos\theta, \tag{13.19}$$

where $d\phi' = d\phi - (a/\Delta)\,dr$.
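Since (13.14) is a quadratic equation in $r^2$, the implicit definition of $r$ can be inverted explicitly. The following Python sketch, with illustrative function names and sample values, builds $(x, y, z)$ from (13.17)-(13.19) and then recovers $r$ from the non-negative root of (13.14).

```python
import math

def kerr_schild_xyz(r, theta, phi_prime, a):
    """Cartesian-like coordinates from (13.17)-(13.19)."""
    st = math.sin(theta)
    x = (r * math.cos(phi_prime) + a * math.sin(phi_prime)) * st
    y = (r * math.sin(phi_prime) - a * math.cos(phi_prime)) * st
    return x, y, r * math.cos(theta)

def radius_from_xyz(x, y, z, a):
    """Non-negative root r of (13.14), treated as a quadratic in r^2."""
    s = x ** 2 + y ** 2 + z ** 2 - a ** 2
    r_squared = 0.5 * (s + math.sqrt(s ** 2 + 4.0 * a ** 2 * z ** 2))
    return math.sqrt(r_squared)

x, y, z = kerr_schild_xyz(2.0, 0.8, 0.3, 0.6)   # arbitrary sample point
r_back = radius_from_xyz(x, y, z, 0.6)
print(r_back)                                    # should be close to 2.0
```

Points with $x^2 + y^2 = a^2$ and $z = 0$ give $r = 0$ in this inversion.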



13.8 The structure of a Kerr black hole 

The Kerr metric is the solution to the empty-space field equations outside a 
rotating massive object and so is only valid down to the surface of the object. 
As in our discussion of the Schwarzschild solution, however, it is of interest to 
consider the structure of the full Kerr geometry as a vacuum solution to the field 
equations. 


Singularities and horizons 

The Kerr metric in Boyer-Lindquist coordinates is singular when $\rho = 0$ and when
$\Delta = 0$. Calculation of the invariant curvature scalar $R_{\mu\nu\sigma\rho}R^{\mu\nu\sigma\rho}$ reveals that only
$\rho = 0$ is an intrinsic singularity. Since

$$\rho^2 = r^2 + a^2\cos^2\theta = 0,$$

it follows that this occurs when

$$r = 0, \qquad \theta = \pi/2.$$

From our earlier discussion of Boyer-Lindquist coordinates, we recall that $r = 0$
represents a disc of coordinate radius $a$ in the equatorial plane. Moreover, the
collection of points with $r = 0$ and $\theta = \pi/2$ constitutes the outer edge of this disc.
Thus, rather surprisingly, the singularity has the form of a ring, of coordinate
radius $a$, lying in the equatorial plane. Similarly, using (13.14) and (13.19), we
see that, in terms of the ‘Cartesian’ coordinates $(\bar{t}, x, y, z)$, the singularity occurs
when $x^2 + y^2 = a^2$ and $z = 0$.

The points where $\Delta = 0$ are coordinate singularities, which occur on the surfaces

$$r_\pm = \mu \pm (\mu^2 - a^2)^{1/2}. \tag{13.20}$$







As discussed above, event horizons in the Kerr metric will occur where $r$ =
constant is a null 3-surface, and this is given by the condition $g^{rr} = 0$ or, equivalently,
$g_{rr} = \infty$. From (13.11), we have

$$g_{rr} = -\frac{\rho^2}{\Delta},$$

from which we see that the surfaces $r = r_+$ and $r = r_-$, for which $\Delta = 0$, are
in fact event horizons. Thus, the Kerr metric has two event horizons. In the
Schwarzschild limit $a \to 0$, these reduce to $r = 2\mu$ and $r = 0$. The surfaces $r = r_\pm$
are axially symmetric, but their intrinsic geometries are not spherically symmetric.
Setting $r = r_\pm$ and $t$ = constant in the Kerr metric and noting from (13.20) that
$r_\pm^2 + a^2 = 2\mu r_\pm$, we obtain two-dimensional surfaces with the line elements

$$d\sigma^2 = \rho_\pm^2\,d\theta^2 + \left(\frac{2\mu r_\pm}{\rho_\pm}\right)^2\sin^2\theta\,d\phi^2, \tag{13.21}$$

which do not describe the geometry of a sphere. If one embeds a 2-surface with
geometry given by (13.21) in three-dimensional Euclidean space, one obtains a
surface resembling an axisymmetric ellipsoid, flattened along the rotation axis.

The existence of the outer horizon $r = r_+$, in particular, shows that the Kerr
geometry represents a (rotating) black hole. It is a one-way surface, like $r = 2\mu$
in the Schwarzschild geometry. Particles and photons can cross it once, from the
outside, but not in the opposite direction. It is common practice to define three
distinct regions of a Kerr black hole, bounded by the event horizons, in which the
solution is regular: region I, $r_+ < r < \infty$; region II, $r_- < r < r_+$; and region III,
$0 < r < r_-$.

Not all values of $\mu$ and $a$ correspond to a black hole, however. From (13.20),
we see that horizons (at real values of $r$) exist only for

$$a^2 \leq \mu^2. \tag{13.22}$$

Thus the magnitude of the angular momentum $J = Mac$ of a rotating black hole
is limited by its squared mass. Moreover, if the condition (13.22) is satisfied
then the intrinsic singularity at $\rho = 0$ is contained safely within the outer horizon
$r = r_+$. An extreme Kerr black hole is one that has the limiting value $a^2 = \mu^2$.
In this case, the event horizons $r_+$ and $r_-$ coincide at $r = \mu$. It may be that
near-extreme Kerr black holes develop naturally in many astrophysical situations.
Matter falling towards a rotating black hole forms an accretion disc that rotates
in the same sense as the hole. As matter from the disc spirals inwards and falls
into the black hole, it carries angular momentum with it and hence increases the
angular momentum of the hole. The process is limited by the fact that radiation
from the infalling matter carries away angular momentum. Detailed calculations
suggest that the limiting value is $a \approx 0.998\mu$, which is very close to the extreme
value.
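The dependence of the horizon radii (13.20) on the rotation parameter is easily tabulated. The following is a minimal Python sketch, assuming units in which $\mu = 1$; the function name and the list of sample values of $a$ are illustrative.

```python
import math

def kerr_horizons(mu, a):
    """Return (r_minus, r_plus) from (13.20), or None if there are no real roots."""
    disc = mu ** 2 - a ** 2
    if disc < 0.0:
        return None              # a^2 > mu^2: Delta has no real zeros
    root = math.sqrt(disc)
    return mu - root, mu + root

for spin in (0.0, 0.5, 0.998, 1.0, 1.1):   # a in units of mu (mu = 1 assumed)
    print(spin, kerr_horizons(1.0, spin))
```

For $a = 0$ the radii are $0$ and $2\mu$, for $a = \mu$ they coincide at $\mu$, and even for $a = 0.998\mu$ the two horizons are still distinct.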

For $a^2 > \mu^2$ we find that $\Delta > 0$ throughout, and so the Kerr metric is regular
everywhere except $\rho = 0$, where there is a ring singularity. Since the horizons
have disappeared, this means that the ring singularity is visible to the outside
world. In fact, one can show explicitly that timelike and null geodesics in the
equatorial plane can start at the singularity and reach infinity, thereby making
the singularity visible to the outside world. Such a singularity is called a naked
singularity (as mentioned in Section 12.6) and opens up an enormous realm for
some truly wild speculation. However, Penrose’s cosmic censorship hypothesis
only allows singularities that are hidden behind an event horizon.


Stationary limit surfaces 

As we showed earlier, in a general stationary axisymmetric spacetime the condition
$g_{tt} = 0$ defines a surface that is both a stationary limit surface and a surface
of infinite redshift. For the Kerr metric, we have

$$g_{tt} = \frac{c^2\left(r^2 - 2\mu r + a^2\cos^2\theta\right)}{\rho^2},$$

so that (for $a^2 \leq \mu^2$) these surfaces, $S^+$ and $S^-$, occur at

$$r_{S^\pm} = \mu \pm (\mu^2 - a^2\cos^2\theta)^{1/2}.$$

The two surfaces are axisymmetric, but setting $r = r_{S^\pm}$ and $t$ = constant in the
Kerr metric, and noting from (13.20) that $r_{S^\pm}^2 + a^2 = 2\mu r_{S^\pm} + a^2\sin^2\theta$, we obtain
two-dimensional surfaces with line elements

$$d\sigma^2 = \rho_{S^\pm}^2\,d\theta^2 + \frac{2\mu r_{S^\pm}\left(2\mu r_{S^\pm} + 2a^2\sin^2\theta\right)}{\rho_{S^\pm}^2}\,\sin^2\theta\,d\phi^2, \tag{13.23}$$

which again do not describe the geometry of a sphere. If one embeds a 2-surface
with geometry given by (13.23) in three-dimensional Euclidean space then a
surface resembling an axisymmetric ellipsoid, flattened along the rotation axis, is
once more obtained. In the Schwarzschild limit $a \to 0$, the surface $S^+$ reduces to
$r = 2\mu$ and $S^-$ to $r = 0$. As anticipated we see that, in the Schwarzschild solution,
the surfaces of infinite redshift and the event horizons coincide.

The surface $S^-$ coincides with the ring singularity in the equatorial plane.
Moreover, $S^-$ lies completely within the inner horizon $r = r_-$ (except at the poles,
where they touch). The surface $S^+$ has coordinate radius $2\mu$ at the equator and,
for all $\theta$, it completely encloses the outer horizon $r = r_+$ (except at the poles,
where they touch), giving rise to a region between the two called the ergoregion.
The structure of a Kerr black hole is illustrated in Figure 13.3.

Figure 13.3 The structure of a Kerr black hole.
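The relative positions of the outer horizon and the outer stationary limit surface can be explored with a few lines of Python. This is only a sketch under the assumption $a^2 < \mu^2$, with illustrative function names and parameter values ($\mu = 1$, $a = 0.9$).

```python
import math

def outer_horizon(mu, a):
    """r_+ from (13.20)."""
    return mu + math.sqrt(mu ** 2 - a ** 2)

def outer_stationary_limit(theta, mu, a):
    """Radius of the surface S+ at polar angle theta."""
    return mu + math.sqrt(mu ** 2 - (a * math.cos(theta)) ** 2)

def in_ergoregion(r, theta, mu, a):
    """True if the point lies between r_+ and the surface S+."""
    return outer_horizon(mu, a) < r < outer_stationary_limit(theta, mu, a)

mu, a = 1.0, 0.9   # illustrative values with a^2 < mu^2
for theta in (0.0, math.pi / 4, math.pi / 2):
    print(theta, outer_horizon(mu, a), outer_stationary_limit(theta, mu, a))
```

The two radii agree at the pole and are furthest apart at the equator, where $S^+$ has coordinate radius $2\mu$.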

The ergoregion 

The ergoregion gets its name from the Greek word ergon meaning work. The
key property of an ergoregion (which can occur in other spacetime geometries)
is that it is a region for which $g_{tt} < 0$ and from which particles can escape.
Clearly, the Schwarzschild geometry does not possess an ergoregion, since $g_{tt} < 0$
is only satisfied within its event horizon. As we will discuss in Section 13.9,
Roger Penrose has shown that it is possible to extract the rotational energy of 
a Kerr black hole from within the ergoregion. To assist in that discussion, it is 
useful here to consider the constraints induced by the spacetime geometry on the 
motion of observers within the ergoregion. 

Since $g_{tt} < 0$ at all points within the ergoregion, an immediate consequence (as
already discussed in Section 13.3) is that an observer (even in a spaceship with
an arbitrarily powerful rocket) cannot remain at a fixed $(r, \theta, \phi)$ position. The
4-velocity of such an observer would be given by

$$[u^\mu] = (u^t, 0, 0, 0), \tag{13.24}$$

but the requirement that $\mathbf{u}\cdot\mathbf{u} = g_{tt}(u^t)^2 = c^2$ cannot be satisfied if $g_{tt} < 0$, showing
that a 4-velocity of the form (13.24) is not possible.

It is possible, however, for a rocket-powered observer to remain at fixed $r$
and $\theta$ coordinates by rotating around the black hole (with respect to an observer
at infinity) in the same sense as the hole’s rotation; this is an illustration of the
frame-dragging phenomenon discussed in Section 13.3. The 4-velocity of such an
observer is

$$[u^\mu] = u^t(1, 0, 0, \Omega), \tag{13.25}$$
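where $\Omega = d\phi/dt$ is his angular velocity with respect to the observer at infinity.
For any particular values of $r$ and $\theta$, there exists a range of allowed values for $\Omega$,
which we now derive. We again require $\mathbf{u}\cdot\mathbf{u} = g_{\mu\nu}u^\mu u^\nu = c^2$ and, using (13.25),
this condition becomes

$$g_{tt}(u^t)^2 + 2g_{t\phi}u^tu^\phi + g_{\phi\phi}(u^\phi)^2 = (u^t)^2\left(g_{tt} + 2g_{t\phi}\Omega + g_{\phi\phi}\Omega^2\right) = c^2. \tag{13.26}$$

Thus, for $u^t$ to be real we require that

$$g_{\phi\phi}\Omega^2 + 2g_{t\phi}\Omega + g_{tt} > 0. \tag{13.27}$$

Since $g_{\phi\phi} < 0$ everywhere, the left-hand side of (13.27) as a function of $\Omega$ gives
rise to an inverted parabola (with its vertex pointing upwards), which is positive
only between its two roots. Thus, the allowed range of angular velocities
is given by $\Omega_- < \Omega < \Omega_+$, where

$$\Omega_\pm = -\frac{g_{t\phi}}{g_{\phi\phi}} \pm \left[\left(\frac{g_{t\phi}}{g_{\phi\phi}}\right)^2 - \frac{g_{tt}}{g_{\phi\phi}}\right]^{1/2} = \omega \pm \left(\omega^2 - \frac{g_{tt}}{g_{\phi\phi}}\right)^{1/2}. \tag{13.28}$$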

There are clearly two special cases to be considered. First, when $g_{tt} = 0$ we
have $\Omega_- = 0$ and $\Omega_+ = 2\omega$. This occurs on the stationary limit surface $r = r_{S^+}$,
which is the outer defining surface of the ergoregion. The lower limit $\Omega_- = 0$ is
precisely the physical meaning of a stationary limit surface: within it an observer
must rotate in the same direction as the black hole and so $\Omega$ must be positive.
For larger values of $r$, however, $\Omega$ can be negative. The second special case to
consider is when $\omega^2 = g_{tt}/g_{\phi\phi}$, in which case $\Omega_\pm = \omega$. Thus, at points where this
condition holds, every observer on a circular orbit is forced to rotate with angular
velocity $\Omega = \omega$. Where (if anywhere) does this condition hold? Upon inserting
the appropriate expressions for $\omega$, $g_{tt}$ and $g_{\phi\phi}$ from the Kerr metric (13.13) into
(13.28), one finds, after some careful algebra, that our condition holds where
$\Delta = 0$, i.e. at the outer event horizon $r = r_+$, which is the inner defining surface
of the ergoregion.

Putting our results together we find that, for an observer at fixed $r$ and $\theta$
coordinates within the ergoregion, the allowed range of angular velocities $\Omega_- <
\Omega < \Omega_+$ becomes progressively narrower as the observer is located closer and
closer to the horizon $r = r_+$, and at the horizon itself the angular velocity is
limited to the single value

$$\Omega_H = \omega(r_+, \theta) = \frac{ac}{2\mu r_+}, \tag{13.29}$$

which is, in fact, independent of $\theta$. We also note that $\Omega_H$ is the maximum
allowed value of the angular velocity for any observer at fixed $r$ and $\theta$ within the
ergoregion.
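The narrowing of the allowed range of $\Omega$ can be seen numerically by evaluating (13.28) at a sequence of equatorial radii between the stationary limit surface and the outer horizon. The Python sketch below assumes units with $c = 1$ and illustrative values $\mu = 1$, $a = 0.9$; the function name is hypothetical.

```python
import math

def omega_range(r, theta, mu, a, c=1.0):
    """Return (Omega_minus, Omega_plus) from (13.28) for the Kerr metric."""
    s2 = math.sin(theta) ** 2
    rho2 = r ** 2 + a ** 2 * (1.0 - s2)
    delta = r ** 2 - 2.0 * mu * r + a ** 2
    sigma2 = (r ** 2 + a ** 2) ** 2 - a ** 2 * delta * s2
    g_tt = c ** 2 * (1.0 - 2.0 * mu * r / rho2)
    g_phiphi = -sigma2 * s2 / rho2
    omega = 2.0 * mu * c * r * a / sigma2
    # max(...) guards against tiny negative rounding errors at the horizon
    half_width = math.sqrt(max(0.0, omega ** 2 - g_tt / g_phiphi))
    return omega - half_width, omega + half_width

mu, a = 1.0, 0.9                         # illustrative values, c = 1
r_plus = mu + math.sqrt(mu ** 2 - a ** 2)
omega_h = a / (2.0 * mu * r_plus)        # (13.29) with c = 1
for r in (2.0, 1.8, 1.6, 1.5, r_plus):   # equatorial radii from S+ down to r_+
    print(r, omega_range(r, math.pi / 2, mu, a))
print(omega_h)
```

At $r = 2\mu$ the lower limit is zero, and at $r = r_+$ both limits agree with $\Omega_H$ to within rounding error.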


Extension of the Kerr metric 

So far we have not discussed the disc region interior to the ring singularity.
Although beyond the scope of our discussion, it may be shown that if a particle
passes through the interior of the ring singularity then it emerges into another
asymptotically flat spacetime, but not a copy of the original one. The new spacetime
is described by the Kerr metric with $r < 0$ and hence $\Delta$ never vanishes, so
there are no event horizons.³

In the new spacetime, the region in the vicinity of the ring singularity has the
very strange property that it allows the existence of closed timelike curves. For
example, consider a trajectory in the equatorial plane that winds around in $\phi$
while keeping $t$ and $r$ constant. The line element along such a path is

$$ds^2 = -\left(r^2 + a^2 + \frac{2\mu a^2}{r}\right)d\phi^2,$$

which is positive if $r$ is negative and small. These are then closed timelike curves,
which violate causality and would seem highly unphysical. If they represent
worldlines of observers, then these observers would travel back and meet themselves
in the past! It must be remembered, however, that the analytic extension
of the Kerr metric to negative values of $r$ is subject to a number of caveats and
may not be physically meaningful. It seems highly improbable that in practice
the gravitational collapse of a real rotating object would lead to such a strange
spacetime.
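The sign change responsible for the closed timelike curves can be seen by evaluating the coefficient of $d\phi^2$ for a few values of $r$. The following Python sketch uses illustrative parameter values ($\mu = 1$, $a = 0.5$) and a hypothetical function name.

```python
def equatorial_phi_coefficient(r, mu, a):
    """Coefficient of dphi^2 in ds^2 at theta = pi/2 with t and r held fixed."""
    return -(r ** 2 + a ** 2 + 2.0 * mu * a ** 2 / r)

mu, a = 1.0, 0.5   # illustrative values
for r in (2.0, -2.0, -0.5, -0.1):
    coeff = equatorial_phi_coefficient(r, mu, a)
    kind = "timelike" if coeff > 0 else "spacelike"
    print(r, coeff, kind)
```

With the signature used here, a positive coefficient means that the closed curve is timelike; for these parameter values this happens only for the small negative values of $r$ in the list.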


13.9 The Penrose process 

We now discuss the Penrose process, by which energy may be extracted from
the rotation of a Kerr black hole (or, indeed, from any spacetime possessing
an ergoregion). Suppose that an observer, with a fixed position at infinity, for
simplicity, fires a particle A into the ergoregion of a Kerr black hole. The energy
of particle A, as measured by the observer at the emission event $\mathcal{E}$, is given by

$$E^{(A)} = \mathbf{p}^{(A)}(\mathcal{E})\cdot\mathbf{u}_{\rm obs} = p_t^{(A)}(\mathcal{E}), \tag{13.30}$$

where $\mathbf{p}^{(A)}(\mathcal{E})$ is the 4-momentum of the particle at this event and $\mathbf{u}_{\rm obs}$ is the
4-velocity of the observer, which has components $[u^\mu_{\rm obs}] = (1, 0, 0, 0)$.

³ In the extended Kerr solution it is common to define region III to cover the coordinate range $-\infty < r < r_-$.
Suppose now that, at some point in the ergoregion, particle A decays into two
particles B and C. By the conservation of momentum, at the decay event $\mathcal{D}$
one has

$$\mathbf{p}^{(A)}(\mathcal{D}) = \mathbf{p}^{(B)}(\mathcal{D}) + \mathbf{p}^{(C)}(\mathcal{D}). \tag{13.31}$$

If the decay occurs in such a way that particle C (say) eventually reaches infinity,
a stationary observer there would measure the particle’s energy at the reception
event $\mathcal{R}$ to be

$$E^{(C)} = p_t^{(C)}(\mathcal{R}) = p_t^{(C)}(\mathcal{D}),$$

where, in the second equality, we have made use of the fact that the covariant time
component of a particle’s 4-momentum is conserved along geodesics in the Kerr
geometry, since the metric is stationary ($\partial_t g_{\mu\nu} = 0$). Similarly, for the original
particle we have $p_t^{(A)}(\mathcal{D}) = p_t^{(A)}(\mathcal{E})$. Thus, the time component of the momentum
conservation condition (13.31) may be written in the form

$$E^{(C)} = E^{(A)} - p_t^{(B)}(\mathcal{D}), \tag{13.32}$$

where $p_t^{(B)}$ is also conserved along the geodesic followed by particle B.

The key step is now to note that $p_t^{(B)} = \mathbf{e}_t\cdot\mathbf{p}^{(B)}$, where $\mathbf{e}_t$ is the $t$-coordinate
basis vector, whose squared ‘length’ is given by

$$\mathbf{e}_t\cdot\mathbf{e}_t = g_{tt}.$$

If particle B were ever to escape beyond the outer surface of the ergoregion,
i.e. to a region where $g_{tt} > 0$, then $\mathbf{e}_t$ would be timelike. Thus, $p_t^{(B)}$ would be
proportional to the particle energy as measured by an observer with 4-velocity
along the $t$-direction. In this case $p_t^{(B)}$ must therefore be positive, and so (13.32)
shows that $E^{(C)} < E^{(A)}$, i.e. we get less energy out than we put in. However, if
the particle B were never to escape the ergoregion but instead fall into the black
hole, then it would remain in a region where $g_{tt} < 0$ and so $\mathbf{e}_t$ is spacelike. In this
case $p_t^{(B)}$ would be a component of spatial momentum, which might be positive
or negative. For decays where it is negative, from (13.32) we see that $E^{(C)} > E^{(A)}$
and so we have extracted energy from the black hole. This is the Penrose process.

What are the consequences of the Penrose process for the black hole? Once the
particle has fallen inside the event horizon, the mass $M$ and angular momentum
$J = Mac$ of the black hole are changed:

$$M \to M + p_t^{(B)}/c^2, \tag{13.33}$$
$$J \to J - p_\phi^{(B)}, \tag{13.34}$$

where in the last equation we must remember that, for particle orbits in general,
$p_\phi$ is minus the component of angular momentum of the particle along the rotation
axis of the black hole. From (13.33), we see that the negative value of $p_t^{(B)}$ for
the infalling particle in the Penrose process reduces the total mass of the black
hole. As we now show, however, the Penrose process also reduces the angular
momentum of the black hole. This is what is meant by saying that the Penrose
process extracts rotational energy from the black hole.

To show that the angular momentum of the black hole is reduced by the
infalling particle, it is useful to consider an observer in the ergoregion at fixed
$r$ and $\theta$ coordinates, who observes the particle B as it passes him. As shown in
Section 13.8, the 4-velocity of such an observer is

$$[u^\mu] = u^t(1, 0, 0, \Omega), \tag{13.35}$$

where $\Omega = d\phi/dt$ is the observer’s angular velocity with respect to infinity. This
observer would measure the energy of particle B to be

$$E^{(B)} = \mathbf{p}^{(B)}\cdot\mathbf{u} = u^t\left(p_t^{(B)} + p_\phi^{(B)}\Omega\right).$$

Since this energy must be positive, we require

$$L < \frac{p_t^{(B)}}{\Omega}, \tag{13.36}$$

where $L = -p_\phi^{(B)}$ is the component of the angular momentum of the particle along
the rotation axis of the hole. Since $p_t^{(B)}$ is negative in the Penrose process and $\Omega$
must be positive for an observer in the ergoregion, we see that $L < 0$. Thus the
infalling particle must have negative angular momentum, which therefore reduces
the net angular momentum of the black hole. Rotational energy can continue to
be extracted until the angular momentum of the black hole is reduced to zero and
it becomes a Schwarzschild black hole.

We can, in fact, go slightly further and set a strict upper limit on $L$ (which, since
$L$ is negative, is equivalent to a lower limit on its magnitude). We actually require
(13.36) to hold for any observer at fixed $r$ and $\theta$ in the ergoregion. From our
earlier discussion of the ergoregion, the maximum value of the angular velocity
occurs for an observer at the horizon $r = r_+$, in which case $\Omega = \Omega_{\mathrm{H}}$; see (13.29).
Thus, denoting the changes in the mass and angular momentum of the black hole
by $\delta M$ and $\delta J$ respectively, the condition (13.36) becomes

$$\delta J < \frac{c^2\,\delta M}{\Omega_{\mathrm{H}}},$$

where it should be remembered that both $\delta M$ and $\delta J$ are negative.





13.10 Geodesics in the equatorial plane 


As one might expect, the general equations for non-null and null geodesics in the
Kerr geometry are much less tractable than in the Schwarzschild case, and particle
trajectories exhibit complicated behaviour. For example, in general the trajectory
of a massive particle or photon is not constrained to lie in a plane. This is a direct
consequence of the fact that the spacetime is not spherically symmetric and so, in
general, the angular momentum of a test particle is not conserved. Since the Kerr
geometry is stationary and axisymmetric, the conserved quantities along particle
trajectories are $p_t$ and $p_\phi$. The latter corresponds to the conservation of only the
component of angular momentum along the rotation axis. Nevertheless, since the
metric is reflection-symmetric through the equatorial plane, particles for which
$p^\theta = 0$, i.e. which are initially in the equatorial plane, will always have $p^\theta = 0$ and
so the trajectory remains in this plane. We shall therefore confine our attention to
this simpler special case.

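Setting $\theta = \pi/2$ in the Kerr metric (13.11), we obtain

$$ds^2 = c^2\left(1 - \frac{2\mu}{r}\right)dt^2 + \frac{4\mu ac}{r}\,dt\,d\phi - \frac{r^2}{\Delta}\,dr^2 - \left(r^2 + a^2 + \frac{2\mu a^2}{r}\right)d\phi^2, \tag{13.37}$$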

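from which the covariant metric components $g_{\mu\nu}$ in the equatorial plane can
be read off. Following the method described in Section 13.1, the corresponding
contravariant metric components are found to be

$$g^{tt} = \frac{1}{c^2\Delta}\left(r^2 + a^2 + \frac{2\mu a^2}{r}\right), \qquad g^{t\phi} = \frac{2\mu a}{cr\Delta},$$

$$g^{rr} = -\frac{\Delta}{r^2}, \qquad g^{\phi\phi} = -\frac{1}{\Delta}\left(1 - \frac{2\mu}{r}\right).$$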
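As a quick numerical sanity check (not part of the original derivation), the following Python sketch multiplies the covariant components read off from (13.37) by the closed-form contravariant components $g^{tt} = (r^2 + a^2 + 2\mu a^2/r)/(c^2\Delta)$, $g^{t\phi} = 2\mu a/(cr\Delta)$, $g^{rr} = -\Delta/r^2$ and $g^{\phi\phi} = -(1 - 2\mu/r)/\Delta$, and confirms that the product is the identity. The function names, the sample radius and the default values $\mu = 1$, $a = 0.8\mu$, $c = 1$ are illustrative assumptions.

```python
def kerr_equatorial_metric(r, mu=1.0, a=0.8, c=1.0):
    """Covariant components (g_tt, g_tphi, g_rr, g_phiphi) read off from (13.37)."""
    delta = r * r - 2.0 * mu * r + a * a
    g_tt = c * c * (1.0 - 2.0 * mu / r)
    g_tphi = 2.0 * mu * a * c / r  # half the coefficient of dt dphi
    g_rr = -r * r / delta
    g_phiphi = -(r * r + a * a + 2.0 * mu * a * a / r)
    return g_tt, g_tphi, g_rr, g_phiphi


def kerr_equatorial_inverse(r, mu=1.0, a=0.8, c=1.0):
    """Closed-form contravariant components (g^tt, g^tphi, g^rr, g^phiphi)."""
    delta = r * r - 2.0 * mu * r + a * a
    gu_tt = (r * r + a * a + 2.0 * mu * a * a / r) / (c * c * delta)
    gu_tphi = 2.0 * mu * a / (c * r * delta)
    gu_rr = -delta / (r * r)
    gu_phiphi = -(1.0 - 2.0 * mu / r) / delta
    return gu_tt, gu_tphi, gu_rr, gu_phiphi


def inverse_residual(r, mu=1.0, a=0.8, c=1.0):
    """Largest deviation of (inverse metric) x (metric) from the identity."""
    g_tt, g_tp, g_rr, g_pp = kerr_equatorial_metric(r, mu, a, c)
    u_tt, u_tp, u_rr, u_pp = kerr_equatorial_inverse(r, mu, a, c)
    entries = [
        u_tt * g_tt + u_tp * g_tp - 1.0,  # (t, t) entry
        u_tt * g_tp + u_tp * g_pp,        # (t, phi) entry
        u_tp * g_tt + u_pp * g_tp,        # (phi, t) entry
        u_tp * g_tp + u_pp * g_pp - 1.0,  # (phi, phi) entry
        u_rr * g_rr - 1.0,                # (r, r) entry
    ]
    return max(abs(e) for e in entries)
```

Calling `inverse_residual(5.0)` should return a value at the level of floating-point rounding; the check is only meaningful away from the horizons, where $\Delta = 0$.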


From (13.37) one can immediately write down the corresponding ‘Lagrangian’
$L = g_{\mu\nu}\dot{x}^\mu\dot{x}^\nu$. In the interests of notational simplicity, for a massive particle we
shall take the particle to have unit rest mass and for a photon we shall choose
an appropriate affine parameter along the null geodesic such that, in both cases,
$p^\mu = \dot{x}^\mu$. One may obtain the geodesic equations by writing down the appropriate
Euler-Lagrange equations. It is quicker, however, simply to use the fact that $p_t$ and
$p_\phi$ are conserved along geodesics (since the metric does not depend explicitly on
$t$ and $\phi$), which leads immediately to the first integrals of the $t$- and $\phi$-equations.
These are given by

$$p_t = g_{tt}\dot{t} + g_{t\phi}\dot{\phi} = c^2\left(1 - \frac{2\mu}{r}\right)\dot{t} + \frac{2\mu ac}{r}\,\dot{\phi} = kc^2, \tag{13.38}$$

$$p_\phi = g_{\phi t}\dot{t} + g_{\phi\phi}\dot{\phi} = \frac{2\mu ac}{r}\,\dot{t} - \left(r^2 + a^2 + \frac{2\mu a^2}{r}\right)\dot{\phi} = -h, \tag{13.39}$$

where we have defined the constants $k$ and $h$ so that in the Schwarzschild limit
$a \to 0$ they coincide with the constants introduced in Chapter 9. This pair of
simultaneous equations for $\dot{t}$ and $\dot{\phi}$ is straightforwardly solved to give

$$\dot{t} = \frac{1}{\Delta}\left[\left(r^2 + a^2 + \frac{2\mu a^2}{r}\right)k - \frac{2\mu a}{cr}\,h\right], \qquad \dot{\phi} = \frac{1}{\Delta}\left[\frac{2\mu ac}{r}\,k + \left(1 - \frac{2\mu}{r}\right)h\right]. \tag{13.40}$$


Instead of using the complicated Euler-Lagrange equation for $r$, we may use
the first integral provided by the invariant length of the 4-momentum $p$. Since
the covariant components of $p$ are particularly simple, the most convenient form
to use is $g^{\mu\nu}p_\mu p_\nu = \epsilon^2$, where $\epsilon^2 = c^2$ for a massive particle and $\epsilon^2 = 0$ for a
photon.$^4$ Since $p_\theta = 0$ this gives

$$g^{tt}(p_t)^2 + 2g^{t\phi}p_t p_\phi + g^{\phi\phi}(p_\phi)^2 + g^{rr}(p_r)^2 = \epsilon^2, \tag{13.41}$$

where, for the moment, it is simpler not to write out the contravariant metric
components in full. By substituting $p_t = kc^2$ and $p_\phi = -h$ into (13.41) and
remembering that $p_r = g_{rr}\dot{r}$ and $g^{rr} = 1/g_{rr}$, we may then obtain the ‘energy’
equation for equatorial trajectories, which gives $\dot{r}$ in terms of only $r$ and a set of
constants. This yields

$$\dot{r}^2 = g^{rr}\left(\epsilon^2 - g^{tt}c^4k^2 + 2g^{t\phi}c^2kh - g^{\phi\phi}h^2\right). \tag{13.42}$$

At this stage, we may (if we wish) substitute the explicit forms for the contravariant
metric coefficients to obtain

$$\dot{r}^2 = c^2k^2 - \epsilon^2 + \frac{2\epsilon^2\mu}{r} + \frac{a^2(c^2k^2 - \epsilon^2) - h^2}{r^2} + \frac{2\mu(h - ack)^2}{r^3}. \tag{13.43}$$

In the limit $a \to 0$, the energy equation reduces to those derived in Chapter 9
for massive-particle ($\epsilon^2 = c^2$) and photon ($\epsilon^2 = 0$) orbits in the Schwarzschild
geometry.

Since we are restricting our attention to the equatorial plane, we need not
consider the Euler-Lagrange equation for $\theta$, since it will not yield an independent
equation of motion. Thus, equations (13.40) and (13.43) completely determine
the null and non-null geodesics in the equatorial plane for given values of the
constants $k$ and $h$.
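The step from (13.42) to the explicit energy equation can be checked numerically. The sketch below evaluates $\dot{r}^2$ once from the contravariant metric components and once from the expanded form $\dot{r}^2 = c^2k^2 - \epsilon^2 + 2\epsilon^2\mu/r + [a^2(c^2k^2 - \epsilon^2) - h^2]/r^2 + 2\mu(h - ack)^2/r^3$; the expanded form is written out here as a working assumption, obtained by substituting the equatorial inverse metric into (13.42). Function names and sample values are illustrative only.

```python
def r_dot_sq_from_metric(r, k, h, eps2, mu=1.0, a=0.8, c=1.0):
    """Evaluate (13.42) using the equatorial contravariant metric components."""
    delta = r * r - 2.0 * mu * r + a * a
    gu_tt = (r * r + a * a + 2.0 * mu * a * a / r) / (c * c * delta)
    gu_tp = 2.0 * mu * a / (c * r * delta)
    gu_rr = -delta / (r * r)
    gu_pp = -(1.0 - 2.0 * mu / r) / delta
    return gu_rr * (eps2 - gu_tt * c ** 4 * k * k
                    + 2.0 * gu_tp * c * c * k * h - gu_pp * h * h)


def r_dot_sq_explicit(r, k, h, eps2, mu=1.0, a=0.8, c=1.0):
    """Expanded energy equation; eps2 = c^2 (massive particle) or 0 (photon)."""
    return (c * c * k * k - eps2 + 2.0 * eps2 * mu / r
            + (a * a * (c * c * k * k - eps2) - h * h) / r ** 2
            + 2.0 * mu * (h - a * c * k) ** 2 / r ** 3)
```

Setting `a=0.0` in `r_dot_sq_explicit` gives the Schwarzschild energy equation, which is a convenient second check.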




4 The device of working in terms of $\epsilon^2$ allows one to calculate the null and non-null geodesic equations
simultaneously; one simply sets $\epsilon^2$ to the appropriate value at the end of the calculation.

The null and non-null geodesics in the equatorial plane can be delineated in
much the same way as for the Schwarzschild case in Chapter 9, albeit requiring
significantly more complicated algebra. Before moving on to discuss particular
examples, however, it is worth noting two essential differences from the
Schwarzschild case. First, in the Kerr equatorial geometry trajectories will depend
upon whether the particle or photon is in a co-rotating (prograde) or counter-rotating
(retrograde) orbit, i.e. rotating about the symmetry axis in the same sense
or the opposite sense to that of the rotating gravitational source. Second, both $t$
and $\phi$ are ‘bad’ coordinates near the horizons. Expressed in terms of these coordinates,
a trajectory approaching a horizon (at $r_+$ or $r_-$) will spiral around the
black hole an infinite number of times, just as it takes an infinite coordinate time
$t$ to cross the horizon; neither behaviour is experienced by an observer comoving
with the particle.


13.11 Equatorial trajectories of massive particles 

For a massive particle, the timelike geodesics in the equatorial plane are governed
by (13.40), and the ‘energy’ equation (13.43) with $\epsilon^2 = c^2$, which reads

$$\dot{r}^2 = c^2(k^2 - 1) + \frac{2\mu c^2}{r} + \frac{a^2c^2(k^2 - 1) - h^2}{r^2} + \frac{2\mu(h - ack)^2}{r^3}. \tag{13.44}$$

The interpretation of the constants $k$ and $h$ may be obtained by considering the
limit $r \to \infty$, in the same way as for the Schwarzschild geometry. One thus finds
that $kc^2$ and $h$ are, respectively, the energy and angular momentum per unit rest
mass of the particle describing the trajectory.

One may rewrite the energy equation (13.44) in the form

$$\tfrac{1}{2}\dot{r}^2 + V_{\mathrm{eff}}(r; h, k) = \tfrac{1}{2}c^2(k^2 - 1), \tag{13.45}$$

where we have identified the effective potential per unit mass as

$$V_{\mathrm{eff}}(r; h, k) = -\frac{\mu c^2}{r} + \frac{h^2 - a^2c^2(k^2 - 1)}{2r^2} - \frac{\mu(h - ack)^2}{r^3}. \tag{13.46}$$


There are several points to note here. First, $V_{\mathrm{eff}}$ has the same $r$-dependence as
the corresponding expression for the Schwarzschild case, derived in Chapter 9;
it is only that the coefficients of the last two terms are more complicated in the
Kerr case. The graph of $V_{\mathrm{eff}}$ therefore has the same general shape as those shown
in Figure 9.4. Indeed, as one would expect, in the limit $a \to 0$ equation (13.46)
reduces to the Schwarzschild result. When $a \neq 0$, however, one must be careful
in interpreting (13.46) as an effective potential, since it depends on the energy $k$
of the particle (as well as the usual dependence on the angular momentum $h$).
Nevertheless, by differentiating (13.45) with respect to the proper time, one finds that the radial
acceleration of a particle is still given by $\ddot{r} = -dV_{\mathrm{eff}}/dr$. Similarly, the stability of
circular orbits, for example, may be deduced by considering the sign of $d^2V_{\mathrm{eff}}/dr^2$
in the usual manner. Also, an incoming particle will fall into the black hole only
if the parameters $h$ and $k$ defining its trajectory are such that $\tfrac{1}{2}c^2(k^2 - 1)$
exceeds the maximum value of $V_{\mathrm{eff}}(r; h, k)$.

In our discussion of the Schwarzschild geometry in Chapter 9, in addition to the
energy equation it was reasonably straightforward also to derive the ‘shape’ equation
for a general massive particle orbit and, equivalently, a simple expression for
$d\phi/dr$. Unfortunately, it is algebraically very complicated (and unilluminating)
to obtain the equivalent expressions for the Kerr geometry, even in the case of
equatorial orbits. It is therefore natural to confine our attention to special cases
in which the symmetry of the orbit makes the algebra more manageable. We are
once again unfortunate, however, since the Kerr solution does not admit radial
geodesics (either null or non-null). In a loose sense, the reason is that the rotating
object ‘drags’ the surrounding space and the geodesics with it. Nevertheless, it is
still reasonably straightforward to consider motion with zero angular momentum
and motion in a circle.


13.12 Equatorial motion of massive particles with zero angular momentum 


For a particle falling into a Kerr black hole whose angular momentum about 
the black hole is zero, we have h = 0. Setting h = 0 reduces the complexity of 
the geodesic equations (13.40-13.44) somewhat. To simplify the equations still 
further, however, we will also consider the limit in which the particle starts at 
rest from infinity, in which case k = 1. In this case the particle will initially be 
moving radially. 

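Using these values of $h$ and $k$, the geodesic equations become

$$\dot{t} = \frac{1}{\Delta}\left(r^2 + a^2 + \frac{2\mu a^2}{r}\right), \qquad \dot{\phi} = \frac{2\mu ac}{r\Delta}, \qquad \dot{r}^2 = \frac{2\mu c^2}{r}\left(1 + \frac{a^2}{r^2}\right).$$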



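From these expressions, we see that both $\dot{t}$ and $\dot{\phi}$ are infinite at the horizons (when
$\Delta = 0$), which is an illustration of the fact that both $t$ and $\phi$ are ‘bad coordinates’
in these regions. Interestingly, the singular behaviours of the $t$ and $\phi$ coordinates
‘cancel’ in the expression for $\dot{r}^2$.

The above equations may in turn be used to obtain expressions relating differentials
of the coordinates along the particle trajectory. In particular, we find that

$$\frac{dr}{dt} = \frac{\dot{r}}{\dot{t}} = -c\Delta\left[\frac{2\mu}{r}\left(1 + \frac{a^2}{r^2}\right)\right]^{1/2}\left(r^2 + a^2 + \frac{2\mu a^2}{r}\right)^{-1},$$

$$\frac{d\phi}{dt} = \frac{\dot{\phi}}{\dot{t}} = \frac{2\mu ac}{r}\left(r^2 + a^2 + \frac{2\mu a^2}{r}\right)^{-1},$$

$$\frac{d\phi}{dr} = \frac{\dot{\phi}}{\dot{r}} = -\frac{2\mu a}{r\Delta}\left[\frac{2\mu}{r}\left(1 + \frac{a^2}{r^2}\right)\right]^{-1/2}.$$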

We note that both $dt/dr$ and $d\phi/dr$ become infinite at the horizons (when
$\Delta = 0$), but $d\phi/dt$ remains finite there. The above equations may be integrated
numerically in a straightforward way to obtain the trajectory of the massive
particle. In Figure 13.4, we plot such a trajectory in the $(ct, r)$-plane and in the
$(x, y)$-plane (where $x = \sqrt{r^2 + a^2}\cos\phi$ and $y = \sqrt{r^2 + a^2}\sin\phi$) for a particle
that passes through the point $(r, \phi) = (8\mu, 0)$ at $t = 0$ in a Kerr geometry with
rotation parameter $a = 0.8\mu$. In particular, we note that both plotted curves have
discontinuities at the horizons, which are shown as broken lines. This illustrates
the pathology of the $t$- and $\phi$-coordinates in these regions. The points in each plot
correspond to unit intervals of $c\tau/\mu$, where $\tau$ is the proper time and we have taken
$\tau = 0$ at $r = 8\mu$. The proper time increases steadily, without becoming singular;
the particle reaches the ring singularity in the equatorial plane at $c\tau/\mu = 10.2$. In
the right-hand plot we also note the effect of frame-dragging on the trajectory of
the initially radially moving particle.

Figure 13.4 The trajectory of an initially radially moving massive particle falling
from rest at infinity in a Kerr geometry with rotation parameter $a = 0.8\mu$. The
trajectory (solid line) is plotted in the $(ct, r)$-plane (left) and the $(x, y)$-plane
(right), where $x = \sqrt{r^2 + a^2}\cos\phi$ and $y = \sqrt{r^2 + a^2}\sin\phi$. The locations of the
horizons (broken lines) and ring singularity (dotted line) are also indicated. The
points correspond to unit intervals of $c\tau/\mu$, where $\tau$ is the proper time and we
have taken $\tau = t = 0$ at $r = 8\mu$.
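The quoted proper time to reach the ring singularity can be reproduced with a few lines of numerical integration. For $h = 0$ and $k = 1$ the radial equation is $\dot{r}^2 = (2\mu c^2/r)(1 + a^2/r^2)$, so that $c\,d\tau = -dr\,[r^3/(2\mu(r^2 + a^2))]^{1/2}$ for an infalling particle. The sketch below integrates this with a simple midpoint rule; the function name, the step count and the choice of quadrature are assumptions made for illustration, not the method used to produce Figure 13.4.

```python
import math


def proper_time_to_radius(r_start, r_end, mu=1.0, a=0.8, steps=100000):
    """Return c*tau elapsed while falling from r_start to r_end (h = 0, k = 1).

    Integrates c dtau = dr * sqrt(r^3 / (2 mu (r^2 + a^2))) by the midpoint rule.
    """
    width = (r_start - r_end) / steps
    total = 0.0
    for i in range(steps):
        r = r_end + (i + 0.5) * width  # midpoint of the i-th sub-interval
        total += math.sqrt(r ** 3 / (2.0 * mu * (r * r + a * a))) * width
    return total
```

With $\mu = 1$ and $a = 0.8$, `proper_time_to_radius(8.0, 0.0)` comes out close to the value $c\tau/\mu = 10.2$ quoted in the text, and the integrand is finite at both horizons, in line with the remark that the proper time does not become singular there.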


13.13 Equatorial circular motion of massive particles 

For circular motion, we require that $\dot{r} = 0$ and, for the particle to remain in a
circular orbit, that the radial acceleration $\ddot{r}$ must also vanish. In terms of the
effective potential defined in (13.46), for a circular orbit at $r = r_c$ we thus require

$$V_{\mathrm{eff}}(r_c; h, k) = \tfrac{1}{2}c^2(k^2 - 1) \quad\text{and}\quad \left.\frac{dV_{\mathrm{eff}}}{dr}\right|_{r = r_c} = 0. \tag{13.47}$$

These two equations determine the values of the constants $k$ and $h$ that correspond
to a circular orbit at some assigned radius $r = r_c$.

Obtaining analytic expressions for $k$ and $h$ is, algebraically, considerably more
complicated than in the Schwarzschild case. The derivation is simplified somewhat,
however, by working in terms of $u = 1/r$. Making this substitution into
(13.46) and then differentiating the resulting expression with respect to $u$, we
find that

$$-\mu c^2u + \tfrac{1}{2}\left[h^2 - a^2c^2(k^2 - 1)\right]u^2 - \mu(h - ack)^2u^3 = \tfrac{1}{2}c^2(k^2 - 1),$$

$$-\mu c^2 + \left[h^2 - a^2c^2(k^2 - 1)\right]u - 3\mu(h - ack)^2u^2 = 0,$$

where the second equality holds since $dV_{\mathrm{eff}}/dr = (dV_{\mathrm{eff}}/du)(du/dr)$, and therefore
$dV_{\mathrm{eff}}/du = 0$ implies that $dV_{\mathrm{eff}}/dr = 0$. The algebra is further eased by
introducing the variable $x = h - ack$, so that the two equations above become

$$-\mu c^2u + \tfrac{1}{2}(x^2 + 2ackx + a^2c^2)u^2 - \mu x^2u^3 = \tfrac{1}{2}c^2(k^2 - 1), \tag{13.48}$$

$$-\mu c^2 + (x^2 + 2ackx + a^2c^2)u - 3\mu x^2u^2 = 0. \tag{13.49}$$

Subtracting (13.48) from $u$ times (13.49) and performing a simple rearrangement
of (13.49), we obtain

$$c^2k^2 = c^2(1 - \mu u) + \mu x^2u^3, \tag{13.50}$$

$$2xacku = x^2u(3\mu u - 1) - c^2(a^2u - \mu). \tag{13.51}$$

These are the basic equations to be used for obtaining analytic expressions for
the constants $k$ and $h$.

Eliminating $k$ between (13.50) and (13.51), we quickly obtain a quadratic
equation for $x^2$,

$$u^2\left[(3\mu u - 1)^2 - 4a^2\mu u^3\right]x^4 - 2c^2u\left[(3\mu u - 1)(a^2u - \mu) - 2ua^2(\mu u - 1)\right]x^2 + c^4(a^2u - \mu)^2 = 0.$$

Using the standard formula for the roots of a quadratic equation, one finds (after
some straightforward but substantial algebra) that

$$x^2 = \frac{c^2(a\sqrt{u} \pm \sqrt{\mu})^2}{u(1 - 3\mu u \mp 2a\sqrt{\mu u^3})}. \tag{13.52}$$

As we shall see below, the upper signs correspond to the counter-rotating circular
orbit and the lower signs to the co-rotating one. Furthermore, in order to obtain
$x$ we must choose either the positive or negative square root of (13.52). As we
might expect from our discussion of the stability of massive particle orbits in the
Schwarzschild case (see Chapter 9), the possibility exists for a circular orbit at
a given coordinate radius to be either stable or unstable. It is straightforward to
show that it is the negative root of (13.52) that corresponds to the stable case, in
which we are most interested. We therefore consider only the solution

$$x = -\frac{c(a\sqrt{u} \pm \sqrt{\mu})}{\left[u(1 - 3\mu u \mp 2a\sqrt{\mu u^3})\right]^{1/2}}. \tag{13.53}$$


Inserting this solution into (13.50), for a stable circular orbit of inverse coordinate
radius $u$ we find that

$$k = \frac{1 - 2\mu u \mp a\sqrt{\mu u^3}}{(1 - 3\mu u \mp 2a\sqrt{\mu u^3})^{1/2}}, \tag{13.54}$$

the energy of a particle of rest mass $m_0$ being $E = km_0c^2$. The corresponding
value of the specific angular momentum for the orbit is obtained by calculating
$h = x + ack$, which gives

$$h = \mp\frac{c\sqrt{\mu}\,(1 + a^2u^2 \pm 2a\sqrt{\mu u^3})}{\sqrt{u}\,(1 - 3\mu u \mp 2a\sqrt{\mu u^3})^{1/2}}. \tag{13.55}$$


We note that, as expected, in the limit $a \to 0$ the expressions (13.54) and (13.55)
reduce to the corresponding results in the Schwarzschild case derived in Chapter 9.
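The closed-form constants can be checked against the two circular-orbit conditions directly. The sketch below codes $k = (1 - 2\mu u \mp a\sqrt{\mu u^3})/(1 - 3\mu u \mp 2a\sqrt{\mu u^3})^{1/2}$ and $h = \mp c\sqrt{\mu}\,(1 + a^2u^2 \pm 2a\sqrt{\mu u^3})/[\sqrt{u}\,(1 - 3\mu u \mp 2a\sqrt{\mu u^3})^{1/2}]$, with the lower signs taken for the co-rotating orbit, and then evaluates the residuals of (13.48) and (13.49). The function names and the `corotating` flag are illustrative assumptions.

```python
import math


def circular_orbit_constants(r, a, mu=1.0, c=1.0, corotating=True):
    """Return (k, h) for an equatorial circular orbit of coordinate radius r.

    Upper signs in the text correspond to s = +1 (counter-rotating),
    lower signs to s = -1 (co-rotating). Requires a real square root,
    i.e. a radius outside the corresponding circular photon orbit.
    """
    u = 1.0 / r
    s = -1.0 if corotating else 1.0
    root = a * math.sqrt(mu * u ** 3)
    denom = math.sqrt(1.0 - 3.0 * mu * u - s * 2.0 * root)
    k = (1.0 - 2.0 * mu * u - s * root) / denom
    h = (-s * c * math.sqrt(mu) * (1.0 + a * a * u * u + s * 2.0 * root)
         / (math.sqrt(u) * denom))
    return k, h


def circular_orbit_residuals(r, a, mu=1.0, c=1.0, corotating=True):
    """Residuals of the conditions (13.48) and (13.49) for the (k, h) above."""
    k, h = circular_orbit_constants(r, a, mu, c, corotating)
    u = 1.0 / r
    x = h - a * c * k
    q = x * x + 2.0 * a * c * k * x + a * a * c * c
    res_energy = (-mu * c * c * u + 0.5 * q * u * u - mu * x * x * u ** 3
                  - 0.5 * c * c * (k * k - 1.0))
    res_force = -mu * c * c + q * u - 3.0 * mu * x * x * u * u
    return res_energy, res_force
```

For $a = 0$ and $r = 6\mu$ the function returns $k = \sqrt{8/9}$ and $h = \sqrt{12}\,\mu c$, the familiar Schwarzschild values.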


13.14 Stability of equatorial massive particle circular orbits

It is worthwhile considering in some detail the stability of equatorial massive- 
particle orbits. Of particular astrophysical interest is the stability of the circular 
orbits discussed above and, especially, the coordinate radius of the innermost 
stable circular orbit in the co- and counter-rotating cases. 

For a circular orbit of coordinate radius $r = r_c$ we require the condition (13.47).
In addition, for marginal stability we require that (for $r = r_c$)

$$\frac{d^2V_{\mathrm{eff}}}{dr^2} = \frac{d^2V_{\mathrm{eff}}}{du^2}\left(\frac{du}{dr}\right)^2 + \frac{dV_{\mathrm{eff}}}{du}\frac{d^2u}{dr^2} = u^3\left(u\,\frac{d^2V_{\mathrm{eff}}}{du^2} + 2\,\frac{dV_{\mathrm{eff}}}{du}\right) = 0,$$

where $u = 1/r$. Since $dV_{\mathrm{eff}}/du = 0$ for a circular orbit, this additional requirement
amounts to $d^2V_{\mathrm{eff}}/du^2 = 0$. From (13.49), this reads

$$x^2 + 2ackx + a^2c^2 - 6\mu x^2u = 0,$$

which may be more conveniently written as

$$u = \frac{x^2 + 2ackx + a^2c^2}{6\mu x^2} = \frac{h^2 - a^2c^2(k^2 - 1)}{6\mu x^2}.$$

Inserting the expressions (13.53-13.55) for $x$, $k$ and $h$ respectively into this
equation and simplifying, one finds that

$$1 - 3a^2u^2 - 6\mu u \mp 8a\sqrt{\mu u^3} = 0.$$

Finally, using $u = 1/r$, one obtains an implicit equation for the coordinate radius
$r$ of the innermost stable circular orbit,

$$r^2 - 6\mu r - 3a^2 \mp 8a\sqrt{\mu r} = 0, \tag{13.56}$$

where, once again, the upper sign corresponds to the counter-rotating orbit and
the lower sign to the co-rotating orbit. In the limit $a = 0$, we see that we recover
$r = 6\mu$ for the innermost stable circular orbit in the Schwarzschild case. In the
extreme Kerr limit $a = \mu$ we find, by inspection, that $r = 9\mu$ for the counter-rotating
orbit and $r = \mu$ for the co-rotating case.

The general solution to the above quartic equation in $\sqrt{r}$ can be found analytically
by standard methods, but the resulting expressions are algebraically messy.
It is more instructive instead to solve the equation numerically and plot the results
for a range of $a/\mu$ values, as shown in Figure 13.5 (left-hand panel). Also of
particular interest is the energy $E$ of a particle in the innermost co- and counter-rotating
stable circular orbits. Using the expression (13.54), in the right-hand
panel of Figure 13.5 we plot $k = E/(m_0c^2)$ for these orbits as a function of $a/\mu$.

Figure 13.5 The scaled coordinate radii $r/\mu$ (left) and the constant $k = E/(m_0c^2)$
(right) for the innermost stable co-rotating and counter-rotating circular
orbits in the equatorial plane of the Kerr geometry, as functions of $a/\mu$.

The difference between the energy $E = km_0c^2$ of a particle in an orbit and the
energy $m_0c^2$ of the particle at rest at infinity is the gravitational binding energy
of the orbit. As discussed in Chapter 9, the binding energy of a particle in an
accretion disc around a compact object can be released. As the particle loses
angular momentum, owing to turbulent viscosity, it gradually moves inwards,
releasing gravitational energy mostly as radiation, until it reaches the innermost
stable circular orbit, at which point it spirals rapidly inwards onto the compact
object. The efficiency $\epsilon_{\mathrm{acc}}$ of the accretion disc is the fraction of the rest mass
energy that can be released in making the transition from rest at infinity to
the innermost stable circular orbit and is given by $\epsilon_{\mathrm{acc}} = 1 - k$. We see from
Figure 13.5 (right-hand panel) that for all values of $a/\mu$ the co-rotating orbit is
the more bound, and the corresponding binding energy is greatest for an extreme
Kerr black hole ($a/\mu = 1$). In this case

$$\epsilon_{\mathrm{acc}} = 1 - \frac{1}{\sqrt{3}} \approx 42\%,$$

and so an accretion disc around such an object could convert nearly one-half
of the rest mass energy of its constituent particles into radiation. For a realistic
astrophysical Kerr black hole that has been ‘spun-up’ by the accretion process one
expects that $a/\mu \approx 0.998$, in which case $\epsilon_{\mathrm{acc}} \approx 32\%$, which is still substantially
larger than the value of 5.7% in the Schwarzschild case.
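The numbers quoted above can be reproduced numerically. The sketch below solves the innermost-stable-orbit condition $r^2 - 6\mu r - 3a^2 \mp 8a\sqrt{\mu r} = 0$ by bisection and then evaluates $\epsilon_{\mathrm{acc}} = 1 - k$ using the circular-orbit expression for $k$. The bracket $[\mu, 10\mu]$, the tolerance, and the function names are assumptions of this sketch, and the co-rotating efficiency is only evaluated for $a < \mu$, since the closed form for $k$ becomes $0/0$ in the extreme limit.

```python
import math


def isco_radius(a, mu=1.0, corotating=True, tol=1e-12):
    """Bisection solve of r^2 - 6 mu r - 3 a^2 -/+ 8 a sqrt(mu r) = 0 on [mu, 10 mu].

    s = +1 selects the upper (counter-rotating) sign, s = -1 the lower one.
    Assumes 0 <= a <= mu.
    """
    s = -1.0 if corotating else 1.0

    def f(r):
        return r * r - 6.0 * mu * r - 3.0 * a * a - s * 8.0 * a * math.sqrt(mu * r)

    lo, hi = mu, 10.0 * mu
    while hi - lo > tol * mu:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid  # root lies below mid
        else:
            lo = mid  # root lies above mid
    return 0.5 * (lo + hi)


def accretion_efficiency(a, mu=1.0, corotating=True):
    """Return 1 - k evaluated at the innermost stable circular orbit (a < mu)."""
    u = 1.0 / isco_radius(a, mu, corotating)
    s = -1.0 if corotating else 1.0
    root = a * math.sqrt(mu * u ** 3)
    k = (1.0 - 2.0 * mu * u - s * root) / math.sqrt(1.0 - 3.0 * mu * u - s * 2.0 * root)
    return 1.0 - k
```

For example, `accretion_efficiency(0.0)` gives about 0.057 and `accretion_efficiency(0.998)` about 0.32, matching the Schwarzschild and ‘spun-up’ values quoted in the text.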


13.15 Equatorial trajectories of photons 

For photons, the null geodesics in the equatorial plane are governed by (13.40)
and the ‘energy’ equation (13.43) with $\epsilon^2 = 0$, which reads

$$\dot{r}^2 = c^2k^2 + \frac{a^2c^2k^2 - h^2}{r^2} + \frac{2\mu(h - ack)^2}{r^3}. \tag{13.57}$$

As in our discussion of photon trajectories in the Schwarzschild geometry in
Chapter 9, it is useful to introduce the parameter $b = h/(ck)$. By considering the
limit $r \to \infty$, one again finds that $b$ may be interpreted as an impact parameter
for the trajectories that extend to infinity. We note that for $r \to \infty$ the constant $k$
is positive and so $b$ (or $h$) has the same sign as $\dot{\phi}$ in this limit.

One may rewrite the energy equation (13.57) in the form

$$\frac{\dot{r}^2}{h^2} + V_{\mathrm{eff}}(r; b) = \frac{1}{b^2}, \tag{13.58}$$

where we have identified the effective potential as

$$V_{\mathrm{eff}}(r; b) = \frac{1}{r^2}\left[1 - \frac{a^2}{b^2} - \frac{2\mu}{r}\left(1 - \frac{a}{b}\right)^2\right]. \tag{13.59}$$

As was the case for massive particles, $V_{\mathrm{eff}}$ has the same $r$-dependence as the
corresponding expression for the Schwarzschild case, derived in Chapter 9, and
so the graph of $V_{\mathrm{eff}}$ has the same general shape. Indeed, as one would expect,
in the limit $a \to 0$ equation (13.59) reduces to the Schwarzschild result. When
$a \neq 0$, however, one must again be careful in interpreting (13.59) as an effective
potential, since it depends on the value $b$ (and hence $k$) of the particle trajectory.
Nevertheless, by differentiating (13.58) with respect to the affine parameter, one finds that the radial
acceleration of a photon is still given by $\ddot{r} = -\tfrac{1}{2}h^2\,dV_{\mathrm{eff}}/dr$. (In fact, by a rescaling
$\lambda \to h\lambda$ of the affine parameter $\lambda$, the explicit $h$-dependence is removed from
this result and (13.58).) Similarly, the stability of a circular orbit, for example,
may be deduced by considering the sign of $d^2V_{\mathrm{eff}}/dr^2$ in the usual manner.


13.16 Equatorial principal photon geodesics 

As might be expected, radial photon geodesics do not exist in the equatorial plane
of the Kerr geometry. Nevertheless, we can obtain information about the radial
variation of the light-cone structure by investigating the principal null geodesics.
These are defined by the condition $b = a$. The system of equations (13.38),
(13.39), (13.57) then reduces to

$$\dot{t} = (r^2 + a^2)k/\Delta,$$
$$\dot{\phi} = ack/\Delta,$$
$$\dot{r} = \pm ck,$$

where the plus sign and the minus sign in the last equation correspond respectively
to outgoing and incoming photons. We can see that such geodesics play the same
role in the Kerr geometry as do the radial geodesics in the Schwarzschild case, in
that the radial coordinate is described at a uniform rate with respect to the affine
parameter.

Choosing $\dot{r} = +ck$ for outgoing photons, we find that

$$\frac{dt}{dr} = \frac{\dot{t}}{\dot{r}} = \frac{r^2 + a^2}{c\Delta}, \qquad \frac{d\phi}{dr} = \frac{\dot{\phi}}{\dot{r}} = \frac{a}{\Delta}.$$

Using the fact that $\Delta > 0$ in region I, it follows that $dr/dt > 0$ in region I, thus
confirming that these equations correspond to outgoing photons. If we restrict our
attention to the case $a^2 < \mu^2$, the equations can be immediately integrated to give


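$$ct = r + \left(\mu + \frac{\mu^2}{\sqrt{\mu^2 - a^2}}\right)\ln\left|\frac{r}{r_+} - 1\right| + \left(\mu - \frac{\mu^2}{\sqrt{\mu^2 - a^2}}\right)\ln\left|\frac{r}{r_-} - 1\right| + \text{constant}, \tag{13.60}$$

$$\phi = \frac{a}{2\sqrt{\mu^2 - a^2}}\ln\left|\frac{r - r_+}{r - r_-}\right| + \text{constant}. \tag{13.61}$$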
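A direct way to check these integrals is to differentiate them numerically and compare with $c\,dt/dr = (r^2 + a^2)/\Delta$ and $d\phi/dr = a/\Delta$. The sketch below does this with a central difference; it codes the integrated forms $ct = r + (\mu + \mu^2/s)\ln|r/r_+ - 1| + (\mu - \mu^2/s)\ln|r/r_- - 1|$ and $\phi = (a/2s)\ln|(r - r_+)/(r - r_-)|$ with $s = \sqrt{\mu^2 - a^2}$ and the integration constants set to zero, which is an assumption of the sketch, as are the function names and step size.

```python
import math


def ct_outgoing(r, mu=1.0, a=0.8):
    """Integrated c*t(r) for an outgoing principal null geodesic (a^2 < mu^2)."""
    s = math.sqrt(mu * mu - a * a)
    r_plus, r_minus = mu + s, mu - s
    return (r + (mu + mu * mu / s) * math.log(abs(r / r_plus - 1.0))
            + (mu - mu * mu / s) * math.log(abs(r / r_minus - 1.0)))


def phi_outgoing(r, mu=1.0, a=0.8):
    """Integrated phi(r) for an outgoing principal null geodesic (a^2 < mu^2)."""
    s = math.sqrt(mu * mu - a * a)
    r_plus, r_minus = mu + s, mu - s
    return a / (2.0 * s) * math.log(abs((r - r_plus) / (r - r_minus)))


def derivative(f, r, step=1e-5):
    """Central-difference estimate of df/dr."""
    return (f(r + step) - f(r - step)) / (2.0 * step)


def delta(r, mu=1.0, a=0.8):
    """Kerr horizon function Delta = r^2 - 2 mu r + a^2."""
    return r * r - 2.0 * mu * r + a * a
```

The comparison works on either side of the outer horizon, for example at $r = 5\mu$ (region I) and at $r = \mu$ (between the horizons, where $\Delta < 0$), but not at $r = r_\pm$ themselves, where the logarithms diverge.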


The solution corresponding to incoming photons is obtained by choosing $\dot{r} = -ck$
and has the same form as above but with $t \to -t$ and $\phi \to -\phi$. In Figure 13.6 we
plot an incoming principal null geodesic in the $(ct, r)$-plane and in the $(x, y)$-plane
(with $x = \sqrt{r^2 + a^2}\cos\phi$ and $y = \sqrt{r^2 + a^2}\sin\phi$) for a photon that passes through
the point $(r, \phi) = (8\mu, 0)$ at $t = 0$ in a Kerr geometry with rotation parameter
$a = 0.8\mu$.

Figure 13.6 A principal null geodesic in a Kerr geometry with rotation parameter
$a = 0.8\mu$. The trajectory (solid line) is plotted in the $(ct, r)$-plane (left)
and in the $(x, y)$-plane (right); $x = \sqrt{r^2 + a^2}\cos\phi$ and $y = \sqrt{r^2 + a^2}\sin\phi$. The
locations of the horizons (broken lines) and the ring singularity (dotted line) are
also indicated.

We note that, in the limit $a \to 0$, (13.60) reduces to the equation for a radial
photon trajectory in the Schwarzschild geometry, as presented in Section 11.3.
Indeed, the null geodesics considered above play the same role as the null radial
geodesics in the Schwarzschild case, giving information about the radial variation
of the light-cone structure. We can draw a spacetime diagram of the light-cone
structure using these equations and we find in region I a diagram analogous to that
obtained for the Schwarzschild geometry in $(t, r, \theta, \phi)$ coordinates in Chapter 11:
the light-cones narrow down as $r \to r_+$. On $r = r_+$ both $t$ and $\phi$ become infinite,
again indicating that this is a coordinate singularity.


13.17 Equatorial circular motion of photons 


For circular photon motion we require $\dot{r} = 0$ and, for the photon to remain in a
circular orbit, the radial acceleration $\ddot{r}$ must also vanish. In terms of the effective
potential defined in (13.59), for a circular orbit at $r = r_c$ we thus require

$$V_{\mathrm{eff}}(r_c; b) = \frac{1}{b^2} \quad\text{and}\quad \left.\frac{dV_{\mathrm{eff}}}{dr}\right|_{r = r_c} = 0. \tag{13.62}$$

These two equations determine a single value $r = r_c$ (different for prograde and
retrograde orbits) for which there exists a circular orbit, and the corresponding
value of the constant $b$.

Using the expression (13.59), the above conditions yield respectively

$$r_c = 3\mu\,\frac{b - a}{b + a}, \tag{13.63}$$

$$(b + a)^3 = 27\mu^2(b - a). \tag{13.64}$$

These equations may be solved by setting $y = b + a$ in (13.64), solving the
resulting cubic equation and substituting the resulting value of $b$ into (13.63). One
may easily verify that the result can be written as

$$r_c = 2\mu\left\{1 + \cos\left[\frac{2}{3}\cos^{-1}\left(\pm\frac{a}{\mu}\right)\right]\right\}, \tag{13.65}$$

$$b = 3\sqrt{\mu r_c} - a, \tag{13.66}$$


where the upper sign in (13.65) corresponds to retrograde orbits and the lower
sign to prograde orbits. In the limit $a \to 0$ we recover the conditions for a circular
photon orbit in the Schwarzschild case, obtained in Chapter 9, namely $r_c = 3\mu$
and $b = 3\sqrt{3}\mu$. As in the Schwarzschild case, circular photon orbits in the Kerr
geometry are unstable.
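A short numerical sketch makes the solution concrete. The code below evaluates $r_c = 2\mu\{1 + \cos[\tfrac{2}{3}\cos^{-1}(-a/\mu)]\}$ and $b = 3\sqrt{\mu r_c} - a$ and then checks the two conditions $r_c = 3\mu(b - a)/(b + a)$ and $(b + a)^3 = 27\mu^2(b - a)$. As a convention adopted only for this sketch, the impact parameter is taken positive and the sense of the orbit is encoded in the sign of $a$: $a > 0$ gives the prograde orbit and $a < 0$ the retrograde one. The function names are illustrative.

```python
import math


def photon_circular_orbit(a, mu=1.0):
    """Return (r_c, b) for the equatorial circular photon orbit.

    Convention of this sketch: b > 0, with a > 0 prograde and a < 0 retrograde.
    Requires |a| <= mu.
    """
    r_c = 2.0 * mu * (1.0 + math.cos((2.0 / 3.0) * math.acos(-a / mu)))
    b = 3.0 * math.sqrt(mu * r_c) - a
    return r_c, b


def photon_orbit_residuals(a, mu=1.0):
    """Residuals of r_c = 3 mu (b - a)/(b + a) and (b + a)^3 = 27 mu^2 (b - a)."""
    r_c, b = photon_circular_orbit(a, mu)
    res_radius = r_c - 3.0 * mu * (b - a) / (b + a)
    res_cubic = (b + a) ** 3 - 27.0 * mu * mu * (b - a)
    return res_radius, res_cubic
```

With $\mu = 1$ this returns $(3, 3\sqrt{3})$ for $a = 0$, approximately $(1, 2)$ for $a = 1$ and $(4, 7)$ for $a = -1$, so the prograde orbit moves inwards and the retrograde orbit outwards as the rotation increases.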


13.18 Stability of equatorial photon orbits


In our discussion of the stability of photon orbits in the Schwarzschild geometry,
it was useful to consider the effective potential for photon motion. As mentioned
above, however, when $a \neq 0$ one must be careful in interpreting (13.59) as an
effective potential since it depends on the value $b$ (and hence $k$) of the particle
trajectory. Nevertheless, we can still investigate the stability of the photon orbits
by factorising the energy equation (13.57).$^5$ One finds that

$$\dot{r}^2 = \frac{(r^2 + a^2)^2 - a^2\Delta}{r^4}\left[c^2k^2 - \frac{4\mu ra}{(r^2 + a^2)^2 - a^2\Delta}\,ckh - \frac{r^2 - 2\mu r}{(r^2 + a^2)^2 - a^2\Delta}\,h^2\right]$$

$$\phantom{\dot{r}^2} = \frac{(r^2 + a^2)^2 - a^2\Delta}{c^2r^4}\left[c^2k - V_+(r)\right]\left[c^2k - V_-(r)\right], \tag{13.67}$$

where $V_\pm(r)$ do not depend on $k$ and are given by

$$V_\pm(r) = \frac{2\mu ra \pm r^2\Delta^{1/2}}{(r^2 + a^2)^2 - a^2\Delta}\,ch. \tag{13.68}$$


The first property to notice is that if $\Delta < 0$ then the functions $V_\pm(r)$ are complex
and so there are no (real) solutions to the equation $\dot{r} = 0$. This shows that the
photon orbit has no turning points. Thus once a photon crosses the surface $\Delta = 0$,
it cannot turn around and return back across the surface. Therefore $\Delta = 0$ defines
an event horizon in the equatorial plane (in fact, as we showed earlier, that $\Delta = 0$
defines the event horizon is true in the general case).
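The factorisation can be verified numerically by comparing the photon energy equation $\dot{r}^2 = c^2k^2 + (a^2c^2k^2 - h^2)/r^2 + 2\mu(h - ack)^2/r^3$ with the product form. The sketch below assumes $V_\pm(r) = ch\,(2\mu ra \pm r^2\Delta^{1/2})/[(r^2 + a^2)^2 - a^2\Delta]$ and an overall prefactor $[(r^2 + a^2)^2 - a^2\Delta]/(c^2r^4)$ multiplying $(c^2k - V_+)(c^2k - V_-)$; these forms, the function names and the sample values are working assumptions of the sketch, chosen so that the two expressions agree identically.

```python
import math


def v_plus_minus(r, h, mu=1.0, a=0.8, c=1.0):
    """Return (V_plus, V_minus); real only where Delta >= 0."""
    delta = r * r - 2.0 * mu * r + a * a
    big = (r * r + a * a) ** 2 - a * a * delta
    root = r * r * math.sqrt(delta)  # raises ValueError if Delta < 0
    return (c * h * (2.0 * mu * r * a + root) / big,
            c * h * (2.0 * mu * r * a - root) / big)


def r_dot_sq_photon(r, k, h, mu=1.0, a=0.8, c=1.0):
    """Photon energy equation (13.57)."""
    return (c * c * k * k + (a * a * c * c * k * k - h * h) / r ** 2
            + 2.0 * mu * (h - a * c * k) ** 2 / r ** 3)


def r_dot_sq_factored(r, k, h, mu=1.0, a=0.8, c=1.0):
    """The same quantity written as a product over the two roots V_plus, V_minus."""
    delta = r * r - 2.0 * mu * r + a * a
    big = (r * r + a * a) ** 2 - a * a * delta
    v_plus, v_minus = v_plus_minus(r, h, mu, a, c)
    return big / (c * c * r ** 4) * (c * c * k - v_plus) * (c * c * k - v_minus)
```

Between the horizons `math.sqrt` raises an error because $\Delta < 0$, which is the numerical counterpart of the statement that $V_\pm$ are complex there and $\dot{r} = 0$ has no real solution.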

The qualitative features of photon trajectories may be deduced by plotting the
functions $V_\pm(r)$. We choose first the case $ah > 0$ (angular momentum in the
same sense as the rotation of the source) and confine our attention to $r > r_+$ (i.e.
outside the outer horizon). The curves are plotted in Figure 13.7. It is clear from
(13.67) that photon propagation is only possible if $c^2k > V_+$ or $c^2k < V_-$, since
we require $\dot{r}^2 > 0$. Thus, at any given coordinate radius $r$, photon propagation
cannot occur if $c^2k$ has a value lying in the region between the curves $V_-(r)$
and $V_+(r)$. However, we must also remember, from (13.38), that $c^2k = p_t$, the
covariant time component of the photon’s 4-momentum. This is the energy of
the photon relative to a fixed observer at infinity. We are used to the idea of
‘positive-energy’ photons with $p_t > 0$. They may come in from infinity and either
reach a minimum $r$ or plunge into the black hole, depending on whether they
encounter the hump in $V_+(r)$.

What about photons for which $p_t < V_-$? Some of these have $p_t > 0$ but others
have $p_t < 0$. Which photons of these types, if any, can actually exist? Near the
horizon in the Kerr metric, the ‘energy’ $p_t$ relative to an observer at infinity has no
obvious physical meaning. The important requirement is that to an observer near
the horizon the photon has a positive energy. A convenient observer, although
any would suffice, is one who resides at fixed $r$ in the equatorial plane, circling
the hole with a fixed angular velocity $\Omega$ (this observer is not on a geodesic, so
would need to be in a spaceship). As discussed in Section 13.8, the observer’s
4-velocity $u$ in $(t, r, \theta, \phi)$ coordinates has components

5 This approach is based on that presented in B. Schutz, A First Course in General Relativity, Cambridge
University Press, 1985.

Figure 13.7 The factored effective-potential diagram for equatorial photon
orbits with positive angular momentum ($ah > 0$). The quantity $\omega_+$ is the value
of $\omega$ at $r = r_+$. Photon propagation is forbidden in the shaded region.

$$[u^\mu] = u^t(1, 0, 0, \Omega).$$

Thus, he measures a photon energy

$$E = p \cdot u = p_tu^t + p_\phi u^\phi = u^t(p_t - \Omega h).$$

The photon must therefore have $p_t > \Omega h$. Since $p_t = c^2k$, we thus require

$$c^2k > \Omega h. \tag{13.69}$$

From our discussion in Section 13.8 about observers in the ergoregion, we know
that $\Omega$ is restricted to lie in the range $\Omega_- < \Omega < \Omega_+$, where $\Omega_\pm$ are given by
(13.28). Comparing (13.28) with (13.68) we see that, remarkably, $V_\pm = \Omega_\pm h$.
Thus, any photon with $c^2k > V_+$ also satisfies the condition (13.69) and so is
allowed, while any photon with $c^2k < V_-$ violates (13.69) and is forbidden. We
conclude that here there is nothing qualitatively different from our discussion of
photon orbits in the Schwarzschild geometry.

For photons moving in the opposite direction to the hole’s rotation ($ah < 0$),
new features do appear. If $ah < 0$ it is clear from (13.68) that the shapes of the
$V_\pm(r)$ curves are simply turned upside down (see Figures 13.8 and 13.7). From
(13.67) directly, we again see that in the region between the curves $V_-(r)$ and
$V_+(r)$ there is no photon propagation. Moreover, the condition (13.69) means
that photons must have $c^2k > V_+$ but from Figure 13.8 we see that in the region
$r < r_{S^+}$ (the ergoregion) some of these photons can have $c^2k < 0$. We can now
understand in an alternative manner an idealised version of the Penrose process,
discussed in Section 13.9. At some point between $r_+$ and $r_{S^+}$ it is allowable to
create two photons, one having $p_t = E$ and the other having $p_t = -E$, so that
their total energy is zero. Then the ‘positive-energy’ photon could be directed in
such a way as to leave the hole and reach infinity, while the ‘negative-energy’
photon is necessarily trapped and inevitably crosses the horizon. The net effect is
that the positive-energy photon will leave the hole, carrying its energy to infinity.
Thus energy has been extracted.

Figure 13.8 The factored effective-potential diagram for equatorial photon
orbits with negative angular momentum ($ah < 0$). The quantity $\omega_+$ is the value
of $\omega$ at $r = r_+$. Photon propagation is forbidden in the shaded region. The Penrose
process is also illustrated (see the text for details).


13.19 Eddington-Finkelstein coordinates 

We have seen throughout our discussion of the Kerr geometry that the Boyer-Lindquist
coordinates $t$ and $\phi$ are ‘bad’ in the region near the horizons. By analogy
with our discussion of removing the coordinate singularity in the Schwarzschild
geometry, we may use the equations for the principal photon geodesics, (13.60)
and (13.61), to obtain a coordinate transformation that extends the solution through
$r = r_+$.

Working with these equations in differential form, we have

$$c\,dt = -\frac{r^2 + a^2}{\Delta}\,dr, \tag{13.70}$$

$$d\phi = -\frac{a}{\Delta}\,dr, \tag{13.71}$$

for the ingoing photons. For advanced Eddington-Finkelstein coordinates
$(t', \phi', r, \theta)$ we want ingoing principal photon trajectories to be straight lines.
Thus, for such a trajectory, we require

$$c\,dt' = -dr \quad\text{and}\quad d\theta = d\phi' = 0.$$


From (13.70) and (13.71), we see immediately that the required transformations are

$$c\,dt' = c\,dt + \frac{2\mu r}{\Delta}\,dr,$$

$$d\phi' = d\phi + \frac{a}{\Delta}\,dr.$$

The Kerr solution in advanced Eddington-Finkelstein coordinates then takes the
form

$$ds^2 = \left(1 - \frac{2\mu r}{\rho^2}\right) c^2 dt'^2 - \frac{4\mu r}{\rho^2}\,c\,dt'\,dr - \left(1 + \frac{2\mu r}{\rho^2}\right) dr^2 + \frac{4\mu r a \sin^2\theta}{\rho^2}\,c\,dt'\,d\phi'$$
$$\qquad + \frac{2(\rho^2 + 2\mu r)\,a \sin^2\theta}{\rho^2}\,dr\,d\phi' - \rho^2 d\theta^2 - \left[(r^2 + a^2)\sin^2\theta + \frac{2\mu r a^2 \sin^4\theta}{\rho^2}\right] d\phi'^2.$$

If we define the advanced time parameter $p = ct' + r$ (such that $dp = 0$ along the
photon geodesic), the Kerr solution can also be written as

$$ds^2 = \left(1 - \frac{2\mu r}{\rho^2}\right) dp^2 - 2\,dp\,dr + \frac{4\mu r a \sin^2\theta}{\rho^2}\,dp\,d\phi' + 2a \sin^2\theta\,dr\,d\phi' - \rho^2 d\theta^2$$
$$\qquad - \left[(r^2 + a^2)\sin^2\theta + \frac{2\mu r a^2 \sin^4\theta}{\rho^2}\right] d\phi'^2.$$

One may alternatively straighten the outgoing photon geodesics by introducing
retarded Eddington-Finkelstein coordinates $(t^*, \phi^*, r, \theta)$ and the retarded time
parameter $q = ct^* - r$, in an analogous manner.
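The transformation to advanced Eddington-Finkelstein coordinates can be checked numerically. The sketch below is only an illustration, not part of the original derivation: it builds the covariant metric components in coordinates $(ct', r, \theta, \phi')$, pulls them back to $(ct, r, \theta, \phi)$ with the Jacobian of $c\,dt' = c\,dt + (2\mu r/\Delta)\,dr$ and $d\phi' = d\phi + (a/\Delta)\,dr$, and compares the result with the Boyer-Lindquist components. The function names and the sample values of $\mu$, $a$, $r$ and $\theta$ are assumptions chosen for the example.

```python
import math


def kerr_functions(mu, a, r, theta):
    """Return (rho^2, Delta, sin^2 theta) for the Kerr geometry."""
    rho2 = r * r + (a * math.cos(theta)) ** 2
    delta = r * r - 2.0 * mu * r + a * a
    return rho2, delta, math.sin(theta) ** 2


def metric_boyer_lindquist(mu, a, r, theta):
    """Covariant components in coordinates (ct, r, theta, phi)."""
    rho2, delta, s2 = kerr_functions(mu, a, r, theta)
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = 1.0 - 2.0 * mu * r / rho2
    g[0][3] = g[3][0] = 2.0 * mu * r * a * s2 / rho2
    g[1][1] = -rho2 / delta
    g[2][2] = -rho2
    g[3][3] = -((r * r + a * a) * s2 + 2.0 * mu * r * a * a * s2 * s2 / rho2)
    return g


def metric_advanced_ef(mu, a, r, theta):
    """Covariant components in coordinates (ct', r, theta, phi')."""
    rho2, _, s2 = kerr_functions(mu, a, r, theta)
    f = 2.0 * mu * r / rho2
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = 1.0 - f
    g[0][1] = g[1][0] = -f                  # half the coefficient of c dt' dr
    g[1][1] = -(1.0 + f)
    g[0][3] = g[3][0] = f * a * s2          # half the coefficient of c dt' dphi'
    g[1][3] = g[3][1] = (1.0 + f) * a * s2  # half the coefficient of dr dphi'
    g[2][2] = -rho2
    g[3][3] = -((r * r + a * a) * s2 + f * a * a * s2 * s2)
    return g


def pull_back_to_bl(mu, a, r, theta):
    """Transform the Eddington-Finkelstein components to Boyer-Lindquist ones."""
    _, delta, _ = kerr_functions(mu, a, r, theta)
    # jac[i][j] = d x'^i / d x^j for x' = (ct', r, theta, phi'), x = (ct, r, theta, phi)
    jac = [
        [1.0, 2.0 * mu * r / delta, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, a / delta, 0.0, 1.0],
    ]
    g_ef = metric_advanced_ef(mu, a, r, theta)
    return [
        [
            sum(jac[i][m] * g_ef[i][j] * jac[j][n] for i in range(4) for j in range(4))
            for n in range(4)
        ]
        for m in range(4)
    ]


def max_mismatch(mu, a, r, theta):
    """Largest absolute difference between the two sets of components."""
    g_bl = metric_boyer_lindquist(mu, a, r, theta)
    g_pb = pull_back_to_bl(mu, a, r, theta)
    return max(abs(g_bl[m][n] - g_pb[m][n]) for m in range(4) for n in range(4))


if __name__ == "__main__":
    # Sample point outside the outer horizon (illustrative values, mu = 1 units).
    print(max_mismatch(1.0, 0.6, 3.0, 1.1))
```

The mismatch should vanish to rounding error at any point where $\Delta \neq 0$, including points between the two horizons; only the transformation itself, not the Eddington-Finkelstein metric, is singular at $\Delta = 0$.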

Figure 13.9 shows a spacetime diagram along the equator of a Kerr black hole
using advanced Eddington-Finkelstein coordinates. As in the Schwarzschild
solution, the event horizon at $r_+$ marks a surface of ‘no return’. Once a particle has
crossed the event horizon, its future is directed towards region III, which contains
the singularity; you can never return to region I. Unlike the Schwarzschild
solution, the singularity in the Kerr solution is timelike (the singularity in the
Schwarzschild solution is spacelike). In theory, this means that it is possible to
avoid the singularity by moving along a timelike path; in other words, if we
were in a spaceship (and ignoring the intense tidal forces which would make this
experiment impractical) we could manoeuvre along a path to avoid the singularity.
Indeed, by performing a maximal extension of the Kerr geometry in an analogous
way to the Kruskal extension of the Schwarzschild geometry described in
Chapter 9, one finds that a particle may re-cross the surface $r = r_-$ and eventually
emerge from $r = r_+$ into a different asymptotically flat spacetime (in an analogous
way to that described for the Reissner-Nordström geometry in Section 12.6).
However, you should not take the internal structure of the Kerr solution too
seriously. As mentioned above, region III also contains closed timelike curves (at
$r < 0$), which are very bad news because they violate causality. Most theorists
would hope that quantum gravity comes to the rescue and prevents causality
violation. At present we do not really know what happens within region III.

Figure 13.9 Spacetime diagram of the Kerr solution in advanced
Eddington-Finkelstein coordinates.

Figure 13.10 shows a schematic illustration of the light-cone structure in the
equatorial plane of the Kerr solution, which also illustrates the frame-dragging
effect. As we approach the infinite redshift surface $S^+$, any particle travelling
against the direction of rotation has to travel at the speed of light just to remain
stationary (relative to a fixed observer at infinity). At smaller $r$, in the ergoregion,
the light-cones are tipped over, so that photons (and massive particles) are forced
to travel in the direction of rotation. At the event horizon $r_+$, the light-cones tip
over so far that the future is directed towards region II.

Figure 13.10 Frame dragging in the equatorial plane of the Kerr solution.


13.20 The slow-rotation limit and gyroscope precession 

Since the full Kerr solution is rather complicated, it is useful to consider the 
simpler approximate form for the common limiting case of a slowly rotating body. 
Thus, we will only keep terms in the Kerr metric to first order in $a$. Writing the
resulting metric in terms of the angular momentum $J = Mac$ of the rotating body,
in Boyer-Lindquist coordinates we obtain

$$ds^2 = ds^2_{\rm Schwarzschild} + \frac{4GJ}{c^2 r}\sin^2\theta\,d\phi\,dt, \qquad (13.72)$$


where the first term on the right-hand side is the standard Schwarzschild 
line element. In the slow-rotation limit, Boyer-Lindquist coordinates tend to 
Schwarzschild coordinates. This metric is useful for performing calculations of, 
for example, the general-relativistic effects due to the rotation of the Earth. In 
fact, for terrestrial applications, and many other astrophysical situations, we may 
also assume the gravitational field to be weak, in which case the line element 
becomes 


$$ds^2 = c^2\left(1 - \frac{2GM}{c^2 r}\right) dt^2 - \left(1 + \frac{2GM}{c^2 r}\right)(dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2) + \frac{4GJ}{c^2 r}\sin^2\theta\,d\phi\,dt. \qquad (13.73)$$


It is often also convenient to work in Cartesian coordinates defined by

$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta,$$

for which the line element is easily shown to take the form

$$ds^2 = c^2\left(1 - \frac{2GM}{c^2 r}\right) dt^2 - \left(1 + \frac{2GM}{c^2 r}\right)(dx^2 + dy^2 + dz^2) + \frac{4GJ}{c^2 r^3}(x\,dy - y\,dx)\,dt, \qquad (13.74)$$

where $r$ is now defined by $r = \sqrt{x^2 + y^2 + z^2}$.
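The passage from (13.73) to (13.74) rests on two identities that follow directly from the definitions of the Cartesian coordinates. The short working below is a sketch of that step, using only $x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$ and $z = r\cos\theta$.

```latex
\begin{align*}
x\,dy - y\,dx
 &= r\sin\theta\cos\phi\,
    (\sin\theta\sin\phi\,dr + r\cos\theta\sin\phi\,d\theta + r\sin\theta\cos\phi\,d\phi) \\
 &\quad - r\sin\theta\sin\phi\,
    (\sin\theta\cos\phi\,dr + r\cos\theta\cos\phi\,d\theta - r\sin\theta\sin\phi\,d\phi) \\
 &= r^2\sin^2\theta\,d\phi, \\[4pt]
dx^2 + dy^2 + dz^2
 &= dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2 .
\end{align*}
```

The first identity gives $\sin^2\theta\,d\phi/r = (x\,dy - y\,dx)/r^3$, which converts the cross term, and the second converts the spatial part.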

To illustrate the usefulness of the slow-rotation limit, we now consider the
precession of a gyroscope induced by the frame-dragging effect of a slowly
rotating body, such as the Earth. As discussed in Chapter 10, in general a gyroscope
in orbit around a massive non-rotating body will precess simply as a result of
the spacetime curvature induced by the massive body (geodesic precession). If
the central body is also rotating then there is an additional precessional effect,
which we now discuss. Let us consider the thought experiment shown
schematically in Figure 13.11. A gyroscope falls freely down the rotation axis of a slowly
rotating body. Initially the spin axis is oriented perpendicular to the rotation axis.
By symmetry, if the body were not rotating then the spin axis would remain
fixed with respect to infinity (e.g. pointing constantly to one distant star); thus
for this particular orbit there is no geodesic precession of the gyroscope. By this
measure, the local inertial frames on the axis are not rotating with respect to
infinity. However, if instead the body were rotating, even with a small angular
momentum, then the gyroscope would precess, indicating that the local inertial
frames are rotating with respect to infinity.


Figure 13.11 A gyroscope (solid circle) falling down the rotation axis of a
spinning body.
We can use the metric (13.74) to calculate the precession rate of the gyroscope
on its downward ‘polar plunge’ trajectory. As shown in Section 10.5, the
spin 4-vector $s(\tau)$ is parallel-transported along the geodesic trajectory. Thus, its
components satisfy

$$\frac{ds^\mu}{d\tau} + \Gamma^\mu{}_{\nu\sigma}\,s^\nu u^\sigma = 0. \qquad (13.75)$$

For the physical arrangement under consideration, the initial 4-velocity $u$ and the
spin 4-vector $s$ of the gyroscope in Cartesian coordinates have the forms

$$[u^\mu] = (u^t, 0, 0, u^z) \quad\text{and}\quad [s^\mu] = (0, s^x, s^y, 0).$$

Moreover, these forms remain valid at all later times, since the trajectory is a
polar plunge and $s \cdot u$ is conserved along it. Thus, in (13.75), the only equations
we need to consider are

$$\frac{ds^x}{d\tau} + \Gamma^x{}_{xt}\,s^x u^t + \Gamma^x{}_{xz}\,s^x u^z + \Gamma^x{}_{yt}\,s^y u^t + \Gamma^x{}_{yz}\,s^y u^z = 0, \qquad (13.76)$$

$$\frac{ds^y}{d\tau} + \Gamma^y{}_{xt}\,s^x u^t + \Gamma^y{}_{xz}\,s^x u^z + \Gamma^y{}_{yt}\,s^y u^t + \Gamma^y{}_{yz}\,s^y u^z = 0. \qquad (13.77)$$

To continue with our calculation, we must first find the connection coefficients
appearing in the above equations. This is most easily achieved using the
‘Lagrangian’ approach, writing down the Euler-Lagrange equation for $x$ and
remembering that on the polar axis all terms proportional to some positive power
of $x$ or $y$ are zero and $r = z$. One finds that the only non-zero connection
coefficients of the form $\Gamma^x{}_{\mu\nu}$ are

$$(\Gamma^x{}_{yt})_{z\text{-axis}} = \frac{2GJ}{c^2 z^3} \quad\text{and}\quad (\Gamma^x{}_{xz})_{z\text{-axis}} = -\frac{GM}{c^2 z(z + 2GM/c^2)}.$$

By considering the symmetry properties of the metric (13.74), one immediately
deduces that, on the polar axis, the only non-zero connection coefficients of the
form $\Gamma^y{}_{\mu\nu}$ are

$$(\Gamma^y{}_{xt})_{z\text{-axis}} = -\frac{2GJ}{c^2 z^3} \quad\text{and}\quad (\Gamma^y{}_{yz})_{z\text{-axis}} = -\frac{GM}{c^2 z(z + 2GM/c^2)}.$$

These connection coefficients can now be substituted into equations (13.76) and
(13.77), which can be solved once $u^t$ and $u^z$ have been determined from the
geodesic equations. Assuming, however, that the speed of the falling gyroscope
is non-relativistic, then to leading order in $1/c$ we may take $u^t \approx 1$ and $u^z \approx 0$.
Thus, in this approximation, equations (13.76) and (13.77) reduce to

$$\frac{ds^x}{d\tau} = -\frac{2GJ}{c^2 z^3}\,s^y \quad\text{and}\quad \frac{ds^y}{d\tau} = \frac{2GJ}{c^2 z^3}\,s^x.$$

Hence, as it falls, the gyroscope precesses in the same direction as the body is
rotating, i.e. the local inertial frames are dragged with respect to infinity. This is
called the Lense-Thirring effect. At a height $z$ the rate of precession is

$$\Omega_{\rm LT} = \frac{2GJ}{c^2 z^3}.$$


It should be noted that we have calculated this precession rate in a Cartesian 
coordinate system in which the centre of the gravitating body is at rest and 
the gyroscope is falling. Fortunately, an observer free-falling with the gyroscope 
would measure the same precession rate since the Lorentz transformation that 
connects the two frames is a boost along the z-axis, which does not affect the 
transverse components $s^x$ and $s^y$ of the spin vector. Of course, the Lense-Thirring
effect also results in the precession of gyroscopes following trajectories other
than the polar plunge considered here, but determining the rate of precession in
general requires a considerably longer calculation (see Exercise 17.26).
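To give a feeling for the size of the effect, the following sketch evaluates the on-axis precession rate $2GJ/(c^2 z^3)$ and integrates the pair of equations $ds^x/d\tau = -\Omega s^y$, $ds^y/d\tau = \Omega s^x$ with a fourth-order Runge-Kutta step. It is only an illustration: the value used for the Earth's angular momentum is an assumed round figure, the height $z$ is held fixed during the integration (the falling gyroscope would in reality see a changing rate), and the function names are invented for the example.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
J_EARTH = 5.86e33  # assumed angular momentum of the Earth, kg m^2 s^-1


def lense_thirring_rate(J, z):
    """On-axis precession rate 2GJ/(c^2 z^3) in rad/s at height z (metres)."""
    return 2.0 * G * J / (C ** 2 * z ** 3)


def precess(sx, sy, omega, tau, steps=1000):
    """Integrate ds^x/dtau = -omega s^y, ds^y/dtau = +omega s^x (RK4, fixed omega)."""
    h = tau / steps
    for _ in range(steps):
        k1x, k1y = -omega * sy, omega * sx
        k2x, k2y = -omega * (sy + 0.5 * h * k1y), omega * (sx + 0.5 * h * k1x)
        k3x, k3y = -omega * (sy + 0.5 * h * k2y), omega * (sx + 0.5 * h * k2x)
        k4x, k4y = -omega * (sy + h * k3y), omega * (sx + h * k3x)
        sx += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        sy += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return sx, sy


if __name__ == "__main__":
    z = 7.0e6  # assumed distance from the Earth's centre, metres
    rate = lense_thirring_rate(J_EARTH, z)
    seconds_per_year = 3.156e7
    mas_per_radian = 180.0 / math.pi * 3600.0 * 1000.0
    print("rate [rad/s]:", rate)
    print("rate [milliarcsec/yr]:", rate * seconds_per_year * mas_per_radian)
    # A spin initially along x rotates towards +y, i.e. in the sense of J along +z.
    print(precess(1.0, 0.0, rate, 1.0e12))
```

The integration confirms that the spin vector keeps its length and rotates about the $z$-axis in the same sense as the body.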


Exercises 

13.1 Verify that the Boyer-Lindquist form of the Kerr metric satisfies the empty-space 
Einstein field equations. 

Note: This exercise is only for the truly dedicated reader! 

13.2 Show that the Boyer-Lindquist form of the Kerr metric can be written in the forms 
(13.12) and (13.13). 

13.3 Calculate the contravariant components of the Kerr metric in Boyer-Lindquist 
coordinates. 

13.4 Show that, in the limit $\mu \to 0$, the Kerr metric tends to the Minkowski metric.

13.5 Show that the Kerr-Schild form of the Kerr metric can be transformed into the 
Boyer-Lindquist form by the coordinate transformations (13.16-13.19). 

13.6 Consider the 2-surfaces defined by $t$ = constant and $r = r_\pm$ in the Kerr geometry.
Show that, for each surface, the circumference around the ‘poles’ is less than the
circumference around the equator. Show that the same is true for the 2-surfaces
defined by $t$ = constant and $r = r_{S^\pm}$.

13.7 Show that the proper area of the event horizon $r = r_+$ in the Kerr geometry is
given by

$$A = 4\pi(r_+^2 + a^2).$$

Hence show that, for fixed $\mu$, the area $A$ is a maximum for $a = 0$. Conversely, for
fixed $A$, show that $\mu$ is a minimum for $a = 0$. Comment on your results.
13.8 An observer is at fixed $(r, \theta)$ coordinates in the ergoregion of a Kerr black hole
and has angular velocity $\Omega = d\phi/dt$ with respect to a second observer at rest at
infinity. Show that the allowed range for $\Omega$ is given by $\Omega_- < \Omega < \Omega_+$, where

/ A \ 1/2 

n ± = w±c (^J 


and $\omega$, $\Delta$ and $\Sigma^2$ have their usual meanings in the Kerr metric.

13.9 Use your answer to Exercise 13.7 to show that the area of the event horizon $r = r_+$
in the Kerr geometry may be written as

$$A = \frac{8\pi G^2}{c^4}\left[M^2 + \left(M^4 - \left(\frac{cJ}{G}\right)^2\right)^{1/2}\right],$$

where $M$ and $J$ are the mass and angular momentum of the black hole. Hence
show that if the mass and angular momentum change by $\delta M$ and $\delta J$ respectively
then the corresponding change in the proper area of the horizon is given by

$$\delta A = \frac{8\pi G a}{c\,\Omega_{\rm H}\sqrt{\mu^2 - a^2}}\left(\delta M - \frac{\Omega_{\rm H}\,\delta J}{c^2}\right),$$

where $\Omega_{\rm H}$ is the ‘angular velocity of the horizon’, defined in (13.29). Thus show
that the area of the event horizon must increase in the Penrose process.

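13.10 Show that, in the equatorial plane $\theta = \pi/2$ of the Kerr geometry, the contravariant
metric components in Boyer-Lindquist coordinates are

$$g^{tt} = \frac{1}{c^2\Delta}\left(r^2 + a^2 + \frac{2\mu a^2}{r}\right), \qquad g^{t\phi} = \frac{2\mu a}{c r \Delta}, \qquad g^{rr} = -\frac{\Delta}{r^2},$$

$$g^{\theta\theta} = -\frac{1}{r^2}, \qquad g^{\phi\phi} = -\frac{1}{\Delta}\left(1 - \frac{2\mu}{r}\right).$$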




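13.11 Show that the geodesic equations for particle motion in the equatorial plane of the
Kerr geometry may be written in Boyer-Lindquist coordinates as

$$\dot t = \frac{1}{\Delta}\left[\left(r^2 + a^2 + \frac{2\mu a^2}{r}\right) k - \frac{2\mu a}{c r}\,h\right],$$

$$\dot\phi = \frac{1}{\Delta}\left[\frac{2\mu a c}{r}\,k + \left(1 - \frac{2\mu}{r}\right) h\right],$$

$$\dot r^2 = (c^2k^2 - \epsilon^2) + \frac{2\epsilon^2\mu}{r} + \frac{a^2(c^2k^2 - \epsilon^2) - h^2}{r^2} + \frac{2\mu(h - ack)^2}{r^3},$$

where $\epsilon^2 = c^2$ for a massive particle and $\epsilon = 0$ for a photon. Verify that these
equations reduce to the Schwarzschild case in the limit $a \to 0$.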

13.12 The trajectory of an infalling particle of mass $m$ in the equatorial plane of a
Kerr black hole is characterised by the usual parameters $k$ and $h$. If the particle
eventually falls into the black hole, show that the mass and the angular momentum
of the hole are changed in such a way that

$$M \to M + km, \qquad J \to J + mh.$$

Show further that the corresponding change $\delta a$ in the rotation parameter of the
black hole is given by

$$\delta a = \frac{m}{cM}(h - ack).$$

If the particle falls into an extreme Kerr black hole, for which $a = \mu$, show that a
naked singularity would be created if

$$h > 2ck\mu.$$

However, by determining the maximum value of the effective potential $V_{\rm eff}(r; h, k)$
defined in (13.46) for $a = \mu$, show that a particle with $h > 2ck\mu$ can never fall
into the black hole.

13.13 For a Kerr black hole, using the Boyer-Lindquist coordinates show that, for a
particle in circular orbit in the $\theta = \pi/2$ plane, the coordinate angular velocity
$\Omega = d\phi/dt$ satisfies

$$\Omega = \frac{c\mu^{1/2}}{a\mu^{1/2} \pm r^{3/2}}.$$

This is the Kerr-metric analogue to $\Omega^2 = GM/r^3$ for the Schwarzschild metric.
Here the plus sign corresponds to prograde orbits, the minus to retrograde orbits.

13.14 If a particle’s motion is initially in the $\theta = \pi/2$ plane in a Kerr metric, show that
the motion will remain in this plane.

13.15 Show that the values of the parameters $k$ and $h$ for a circular orbit of coordinate
radius $r = r_c$, given in (13.54) and (13.55) respectively, satisfy the requirements

$$V_{\rm eff}(r_c; h, k) = \tfrac{1}{2}c^2(k^2 - 1) \quad\text{and}\quad \left.\frac{dV_{\rm eff}}{dr}\right|_{r = r_c} = 0.$$

Show further that for the orbit to be stable one requires

$$r_c^2 - 6\mu r_c - 3a^2 \mp 8a\sqrt{\mu r_c} \geq 0.$$


13.16 An observer (not necessarily free-falling) orbits a Kerr black hole in the equatorial
plane in a circular orbit. His ‘angular velocity with respect to a distant observer’
is $\Omega = d\phi/dt$. Find the components $u^t$, $u^\phi$, $u_t$ and $u_\phi$ in terms of $\Omega$, $r$, $\mu$ and $a$.

13.17 Suppose that the circular orbit considered in Exercise 13.16 lies outside the horizon
$r_+$ but inside the stationary limit $r_{S^+}$. Show that under these circumstances $\Omega$ must
be non-zero, i.e. the observer cannot remain at rest relative to a distant observer.
If the orbiting observer is in the region $r_- < r < r_+$, show that the orbit cannot be
circular.
13.18 Show that the effective potential for photon orbits in the equatorial plane of the
Kerr geometry is given by

$$V_{\rm eff}(r; b) = \frac{1}{r^2}\left[1 - \left(\frac{a}{b}\right)^2 - \frac{2\mu}{r}\left(1 - \frac{a}{b}\right)^2\right].$$




13.19 For a circular photon orbit of coordinate radius $r = r_c$ in the Kerr geometry,
show that

$$r_c = 2\mu\left\{1 + \cos\left[\tfrac{2}{3}\cos^{-1}\left(\pm\frac{a}{\mu}\right)\right]\right\},$$

$$b = 3\sqrt{\mu r_c} - a,$$

where the upper sign in the first equation corresponds to retrograde orbits and the
lower sign to prograde orbits. Hence show that, for an extreme Kerr black hole
($a = \mu$), $r_c = 4\mu$ for a retrograde orbit and $r_c = \mu$ for a prograde orbit.

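13.20 For a photon orbit in the equatorial plane of the Kerr geometry, show that

$$\dot r^2 = \frac{(r^2 + a^2)^2 - a^2\Delta}{c^2 r^4}\,[c^2k - V_+(r)][c^2k - V_-(r)],$$

where

$$V_\pm(r) = \frac{2\mu r a \pm r^2\Delta^{1/2}}{(r^2 + a^2)^2 - a^2\Delta}\,ch = \left[\omega \pm \frac{c r^2\Delta^{1/2}}{(r^2 + a^2)^2 - a^2\Delta}\right] h.$$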


13.21 The general axisymmetric stationary metric can be written in the form

$$ds^2 = A\,dt^2 - B(d\phi - \omega\,dt)^2 - C\,dr^2 - D\,d\theta^2,$$

where $A$, $B$, $C$, $D$ and $\omega$ are functions only of the coordinates $r$ and $\theta$. Alice
is an astronaut in a powered spaceship that maintains fixed $(r, \phi)$ coordinates in
the equatorial plane $\theta = \pi/2$ (at a position for which $g_{tt} > 0$). She simultaneously
emits two photons in opposite tangential directions in the equatorial plane and uses
a prearranged system of mirrors to cause each photon to move along a circular
(non-geodesic) path of constant $r$. Show that the coordinate angular velocities of
the two photons are given by

$$\frac{d\phi}{dt} = \omega \pm \left(\frac{A}{B}\right)^{1/2}.$$

Hence show that the two photons do not arrive back with Alice simultaneously
but are separated by a time interval

$$\Delta\tau = \frac{4\pi\omega B}{c(A - B\omega^2)^{1/2}},$$

as measured by Alice’s on-board clock. Comment on the physical significance of
this result.
13.22 Bob is in a powered spaceship following a circular orbit $r$ = constant in the
equatorial plane of the geometry in Exercise 13.21. His angular velocity is such
that the component $u_\phi$ of his 4-velocity is zero. Using the same arrangement of
mirrors as in Exercise 13.21, he performs an experiment similar to Alice’s. Show
that for Bob the two photons arrive back to him simultaneously.

13.23 Which, if any, of the photons considered in Exercises 13.21 and 13.22 is redshifted
from its original frequency on arriving back with Alice or Bob? Explain your
reasoning.

13.24 An isolated thin rigid spherical shell has mass $M$ and radius $R$. If the shell is set
spinning slowly, with angular momentum $J$, show that inertial frames within the
shell rotate with angular velocity

$$\omega = \frac{2GJ}{c^2 R^3}.$$

Comment briefly on how this result is related to Mach’s principle.
We now discuss the application of general relativity to modelling the behaviour
of the universe as a whole. In order to do this, we make some far-reaching
assumptions, but only those consistent with our observations of the universe. As
in our derivations of the Schwarzschild and Kerr geometries, we begin by using
symmetry arguments to restrict the possible forms for the metric describing the
overall spacetime geometry of the universe.$^1$


14.1 The cosmological principle 

When we look up at the sky we see that the stars around us are grouped into a
large density concentration, the Milky Way Galaxy. On a slightly larger scale,
we see that our Galaxy belongs to a small group of galaxies (called the Local
Group). Our Galaxy and our nearest large neighbour, the Andromeda galaxy,
dominate the mass of the Local Group. On still larger scales we see that our
Local Group sits on the outskirts of a giant supercluster of galaxies centred in the
constellation of Virgo. Evidently, on small scales matter is distributed in a highly
irregular way but, as we look on larger and larger scales, the matter distribution
looks more and more uniform. In fact, we have very good evidence (particularly
from the constancy of the temperature of the cosmic microwave background in
different directions on the sky) that the universe is isotropic on the very largest
scales, to high accuracy. If the universe has no preferred centre then isotropy also
implies homogeneity. We therefore have good physical reasons to study simple
cosmological models in which the universe is assumed to be homogeneous and
isotropic.$^2$ We thus assume the cosmological principle, which states that at any
particular time the universe looks the same from all positions in space and all
directions in space at any point are equivalent.

1 For a detailed discussion, see, for example, J. N. Islam, An Introduction to Mathematical Cosmology,
Cambridge University Press, 1992.


14.2 Slicing and threading spacetime 

The intuitive statement of the cosmological principle given above needs to be 
made more precise. In particular, how does one define a ‘particular time’ in general 
relativity that is valid globally, when there are no global inertial frames? Also,
since observers moving relative to one another will view the universe differently,
according to which observers do we demand the universe to appear isotropic?

In general relativity the concept of a ‘moment of time’ is ambiguous and is 
replaced by the notion of a three-dimensional spacelike hypersurface. To define 
a ‘time’ parameter that is valid globally, we ‘slice up’ spacetime by introducing
a series of non-intersecting spacelike hypersurfaces that are labelled by some 
parameter t. This parameter then defines a universal time in that ‘a particular 
time’ means a given spacelike hypersurface. It should be noted, however, that 
we may construct the hypersurfaces t = constant in any number of ways. In a 
general spacetime there is no preferred ‘slicing’ and hence no preferred ‘time’ 
coordinate t. 

It is useful at this point to introduce the idealised concept of fundamental 
observers, who are assumed to have no motion relative to the overall cosmological 
fluid associated with the ‘smeared-out’ motion of all the galaxies and other matter 
in the universe. A fundamental observer would, for example, measure no dipole 
moment in his observations of the cosmic microwave background radiation; an 
observer with a non-zero peculiar velocity would observe such a dipole as a 
result of the Doppler effect arising from his motion relative to the cosmological 
fluid. Adopting Weyl’s postulate, the timelike worldlines of these observers are 
assumed to form a bundle, or congruence, in spacetime that diverges from a point 
in the (finite or infinitely distant) past or converges to such a point in the future. 
These worldlines are non-intersecting, except possibly at a singular point in the 
past or future or both. Thus, there is a unique worldline passing through each 
(non-singular) spacetime point. The set of worldlines is sometimes described as 
providing threading for the spacetime. 

The hypersurfaces t = constant may now be naturally constructed in such a way 
that the 4-velocity of any fundamental observer is orthogonal to the hypersurface. 


2 It is worth noting that isotropy about every point automatically implies homogeneity. However, homogeneity 
does not necessarily imply isotropy. For example, a universe with a large-scale magnetic field that pointed 
in one direction everywhere and had the same magnitude at every point would be homogeneous but not 
isotropic. 
Figure 14.1 Representation (with one spatial dimension suppressed) of spacelike
hypersurfaces on which fundamental observers are assumed to lie. The worldline 
of any fundamental observer is orthogonal to any such surface. 


Thus, the surface of simultaneity of the local Lorentz frame of any such observer 
coincides locally with the hypersurface (see Figure 14.1). Each hypersurface may 
therefore be considered as the ‘meshing together’ of all the local Lorentz frames 
of the fundamental observers. 


14.3 Synchronous coordinates 

The spacelike hypersurfaces discussed above are labelled by a parameter $t$, which
may be taken to be the proper time along the worldline of any fundamental
observer. The parameter $t$ is then called the synchronous time coordinate. In
addition, we may also introduce spatial coordinates $(x^1, x^2, x^3)$ that are constant
along any worldline. Thus each fundamental observer has fixed $(x^1, x^2, x^3)$
coordinates, and so the latter are called comoving coordinates. Since each
hypersurface $t$ = constant is orthogonal to the observer’s worldline, the line element
takes the form

$$ds^2 = c^2 dt^2 - g_{ij}\,dx^i dx^j \quad (\text{for } i, j = 1, 2, 3), \qquad (14.1)$$

where the $g_{ij}$ are functions of the coordinates $(t, x^1, x^2, x^3)$.

We may verify that the metric (14.1) does indeed incorporate the properties
described in the previous section, as follows. Let $x^\mu(\tau)$ be the worldline of a
fundamental observer, where $\tau$ is the proper time along the worldline. Then, by
construction, $x^\mu(\tau)$ is given by

$$x^0 = \tau, \quad x^1 = \text{constant}, \quad x^2 = \text{constant}, \quad x^3 = \text{constant}. \qquad (14.2)$$

Since $dx^i = 0$ along the worldline, we obtain $ds = c\,d\tau = c\,dt$ and so $t = \tau$,
thereby showing that the proper time $\tau$ along the worldline is indeed equal to $t$.
Thus, from (14.2), it is clear that the 4-velocity of a fundamental observer in
comoving coordinates is

$$[u^\mu] = \left[\frac{dx^\mu}{d\tau}\right] = (1, 0, 0, 0). \qquad (14.3)$$

Since any vector lying in the hypersurface $t$ = constant has the form $[a^\mu] =
(0, a^1, a^2, a^3)$, we see that

$$g_{\mu\nu}\,u^\mu a^\nu = 0,$$

because $g_{0i} = 0$ for $i = 1, 2, 3$. Hence, the observer’s 4-velocity is orthogonal to
the hypersurface, as we required. Finally, we may show that the worldline given
by (14.2) satisfies the geodesic equation

$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma}\frac{dx^\nu}{d\tau}\frac{dx^\sigma}{d\tau} = 0.$$

Using (14.3), we see that we require only that $\Gamma^\mu{}_{00} = 0$. This quantity is given by

$$\Gamma^\mu{}_{00} = \tfrac{1}{2}g^{\mu\nu}(2\partial_0 g_{0\nu} - \partial_\nu g_{00}),$$

which is easily shown to be zero by using the facts that $g_{0i} = 0$ for $i = 1, 2, 3$ and
that $g_{00} = c^2$ is constant. Thus the worldlines $x^\mu(\tau)$ are geodesics and hence can
describe particles (observers) moving only under the influence of gravity.
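The vanishing of $\Gamma^\mu{}_{00}$ can be made explicit by splitting the sum over $\nu$ into its time and space parts. The short working below is a sketch that uses only the properties of the synchronous metric, namely $g_{00} = c^2$ and $g_{0i} = 0$.

```latex
\begin{align*}
\Gamma^\mu{}_{00}
 &= \tfrac{1}{2}\, g^{\mu 0}\,(2\partial_0 g_{00} - \partial_0 g_{00})
  + \tfrac{1}{2}\, g^{\mu i}\,(2\partial_0 g_{0i} - \partial_i g_{00}) \\
 &= \tfrac{1}{2}\, g^{\mu 0}\,\partial_0 (c^2)
  + \tfrac{1}{2}\, g^{\mu i}\,\bigl(2\partial_0 (0) - \partial_i (c^2)\bigr) \\
 &= 0 .
\end{align*}
```

Every term contains a derivative of a constant, so the result holds whatever the functions $g_{ij}(t, x^1, x^2, x^3)$ may be.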


14.4 Homogeneity and isotropy of the universe 

The metric (14.1) does not yet incorporate the property that space is homogeneous 
and isotropic. Indeed this form of the metric can be used, with the help of a special 
coordinate system obtained by singling out a particular fundamental observer, 
to derive some general properties of the universe, without the assumptions of 
homogeneity and isotropy, although we will not consider such cases here.

Let us now incorporate the postulates of homogeneity and isotropy. The former
demands that all points on a particular spacelike hypersurface are equivalent,
whereas the latter demands that all directions on the hypersurface are
equivalent for fundamental observers. The (squared) spatial separation on the same
hypersurface $t$ = constant of two nearby galaxies at coordinates $(x^1, x^2, x^3)$ and
$(x^1 + \Delta x^1, x^2 + \Delta x^2, x^3 + \Delta x^3)$ is

$$d\sigma^2 = g_{ij}\,\Delta x^i \Delta x^j.$$

If we consider the triangle formed by three nearby galaxies at some particular
time $t$, then isotropy requires that the triangle formed by these same galaxies at
some later time must be similar to the original triangle. Moreover, homogeneity
requires that the magnification factor must be independent of the position of the
triangle in the 3-space. It thus follows that time $t$ can enter the $g_{ij}$ only through
a common factor, so that the ratios of small distances are the same at all times.
Hence the metric must take the form

$$ds^2 = c^2 dt^2 - S^2(t)\,h_{ij}\,dx^i dx^j, \qquad (14.4)$$

where $S(t)$ is a time-dependent scale factor and the $h_{ij}$ are functions of the
coordinates $(x^1, x^2, x^3)$ only. We note that it is common practice to identify
fundamental observers loosely with individual galaxies (which are assumed to be
pointlike). However, since the magnification factor is independent of position, we
must neglect the small peculiar velocities of real individual galaxies.


14.5 The maximally symmetric 3-space 

We clearly require the 3-space spanned by the spacelike coordinates $(x^1, x^2, x^3)$
to be homogeneous and isotropic. This leads us to study the maximally symmetric
3-space. In three dimensions, the curvature tensor has, in general, six
independent components, each of which is a function of the coordinates. We
therefore need to specify six functions to define the intrinsic geometric properties
of a general three-dimensional space. Clearly, the more symmetrical the space,
the fewer the functions needed to specify its properties. A maximally symmetric
space is specified by just one number, the curvature $K$, which is independent
of the coordinates. Such constant curvature spaces must clearly be homogeneous
and isotropic.

The curvature tensor of a maximally symmetric space must take a particularly
simple form. It must clearly depend on the constant $K$ and on the metric tensor
$g_{ij}$. The simplest expression that satisfies the various symmetry properties and
identities of $R_{ijkl}$ and contains just $K$ and the metric tensor is given by

$$R_{ijkl} = K(g_{ik}g_{jl} - g_{il}g_{jk}). \qquad (14.5)$$

In fact, a maximally symmetric space is defined as one having a curvature tensor
of the form (14.5).

The Ricci tensor is given by

$$R_{jk} = g^{il}R_{ijkl} = Kg^{il}(g_{ik}g_{jl} - g_{il}g_{jk}) = K(\delta^l_k g_{jl} - \delta^l_l g_{jk}) = K(g_{jk} - 3g_{jk}) = -2Kg_{jk}.$$

The curvature scalar is thus given by

$$R = R^k{}_k = -2K\delta^k_k = -6K.$$
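The two contractions above can be checked numerically for an arbitrary metric. The following sketch is an illustration only: it builds $R_{ijkl} = K(g_{ik}g_{jl} - g_{il}g_{jk})$ for a sample symmetric positive-definite $3 \times 3$ matrix, contracts with the inverse metric, and compares the results with $-2Kg_{jk}$ and $-6K$. The sample metric, the value of $K$ and the function names are assumptions made for the example.

```python
def inverse_3x3(g):
    """Inverse of a 3x3 matrix using the adjugate (transposed cofactor) formula."""
    (a, b, c), (d, e, f), (p, q, r) = g
    det = a * (e * r - f * q) - b * (d * r - f * p) + c * (d * q - e * p)
    adj = [
        [e * r - f * q, c * q - b * r, b * f - c * e],
        [f * p - d * r, a * r - c * p, c * d - a * f],
        [d * q - e * p, b * p - a * q, a * e - b * d],
    ]
    return [[adj[i][j] / det for j in range(3)] for i in range(3)]


def riemann(g, K):
    """Curvature tensor R_ijkl = K (g_ik g_jl - g_il g_jk) as a nested list."""
    n = range(3)
    return [
        [[[K * (g[i][k] * g[j][l] - g[i][l] * g[j][k]) for l in n] for k in n] for j in n]
        for i in n
    ]


def ricci(g, K):
    """Ricci tensor R_jk = g^il R_ijkl."""
    ginv = inverse_3x3(g)
    R = riemann(g, K)
    n = range(3)
    return [[sum(ginv[i][l] * R[i][j][k][l] for i in n for l in n) for k in n] for j in n]


def ricci_scalar(g, K):
    """Curvature scalar R = g^jk R_jk."""
    ginv = inverse_3x3(g)
    Rjk = ricci(g, K)
    return sum(ginv[j][k] * Rjk[j][k] for j in range(3) for k in range(3))


if __name__ == "__main__":
    g_sample = [[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.0]]  # illustrative metric
    K_sample = 0.7
    print(ricci(g_sample, K_sample))         # compare with -2 K g_jk
    print(ricci_scalar(g_sample, K_sample))  # compare with -6 K
```

Because the result depends only on index algebra, it holds for any invertible symmetric $g_{ij}$, not just the sample used here.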


As in our derivation of the general static isotropic metric in Section 9.1, the
metric of an isotropic 3-space must depend only on the rotational invariants

$$\mathbf{x}\cdot\mathbf{x} = r^2, \qquad d\mathbf{x}\cdot d\mathbf{x}, \qquad \mathbf{x}\cdot d\mathbf{x},$$

and in spherical polar coordinates $(r, \theta, \phi)$ it must take the form

$$d\sigma^2 = C(r)(\mathbf{x}\cdot d\mathbf{x})^2 + D(r)\,d\mathbf{x}\cdot d\mathbf{x} = C(r)r^2 dr^2 + D(r)(dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2).$$

Following our analysis in Chapter 9, we can simplify this line element by
redefining the radial coordinate $\bar r^2 = r^2 D(r)$. Dropping the bars on the variables, the
metric can thus be written as

$$d\sigma^2 = B(r)\,dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2,$$

where $B(r)$ is an arbitrary function of $r$.

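We have met this line element before; it is identical to the space part of the
general static isotropic metric. In Chapter 9, we showed that the only non-zero
connection coefficients are

$$\Gamma^r{}_{rr} = \frac{1}{2B(r)}\frac{dB(r)}{dr}, \qquad \Gamma^r{}_{\theta\theta} = -\frac{r}{B(r)}, \qquad \Gamma^r{}_{\phi\phi} = -\frac{r\sin^2\theta}{B(r)},$$

$$\Gamma^\theta{}_{r\theta} = \Gamma^\phi{}_{r\phi} = \frac{1}{r}, \qquad \Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta, \qquad \Gamma^\phi{}_{\theta\phi} = \cot\theta.$$

The Ricci tensor is given in terms of the connection coefficients by

$$R_{ij} = \partial_j\Gamma^k{}_{ik} - \partial_k\Gamma^k{}_{ij} + \Gamma^l{}_{ik}\Gamma^k{}_{lj} - \Gamma^l{}_{ij}\Gamma^k{}_{lk},$$

and, after some algebra, we find that its non-zero components are

$$R_{rr} = -\frac{1}{rB}\frac{dB}{dr},$$

$$R_{\theta\theta} = \frac{1}{B} - 1 - \frac{r}{2B^2}\frac{dB}{dr},$$

$$R_{\phi\phi} = R_{\theta\theta}\sin^2\theta.$$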

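For our 3-space to be maximally symmetric, however, we must have

$$R_{ij} = -2Kg_{ij},$$

and so, from the $rr$- and $\theta\theta$-components, we obtain the two equations

$$\frac{1}{rB}\frac{dB}{dr} = 2KB, \qquad (14.6)$$

$$1 + \frac{r}{2B^2}\frac{dB}{dr} - \frac{1}{B} = 2Kr^2. \qquad (14.7)$$

Integrating (14.6), we immediately obtain

$$B = \frac{1}{A - Kr^2},$$

where $A$ is a constant of integration. Substituting this expression into (14.7) then
gives, after cancelling a common term $Kr^2$,

$$1 - A + Kr^2 = Kr^2,$$

from which we see that $A = 1$. Thus, we have constructed the line element for
the maximally symmetric 3-space, which takes the form

$$d\sigma^2 = \frac{dr^2}{1 - Kr^2} + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2, \qquad (14.8)$$

and has a curvature tensor specified by one number, $K$, the curvature of the space.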
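As a quick consistency check, the solution $B(r) = 1/(A - Kr^2)$ can be substituted numerically into the two conditions that follow from $R_{ij} = -2Kg_{ij}$, namely $(1/rB)\,dB/dr = 2KB$ and $1 + (r/2B^2)\,dB/dr - 1/B = 2Kr^2$. The sketch below is illustrative only: it uses a central finite difference for $dB/dr$, and the function names and sample values are assumptions. It shows that the first condition holds for any $A$, whereas the second leaves a residual $1 - A$ and so singles out $A = 1$.

```python
def B(r, K, A=1.0):
    """Radial metric function B(r) = 1 / (A - K r^2)."""
    return 1.0 / (A - K * r * r)


def dB_dr(r, K, A=1.0, h=1e-6):
    """Central finite-difference estimate of dB/dr."""
    return (B(r + h, K, A) - B(r - h, K, A)) / (2.0 * h)


def residuals(r, K, A=1.0):
    """Residuals of the rr and theta-theta conditions for R_ij = -2 K g_ij."""
    b = B(r, K, A)
    db = dB_dr(r, K, A)
    res_rr = db / (r * b) - 2.0 * K * b
    res_thth = 1.0 + r * db / (2.0 * b * b) - 1.0 / b - 2.0 * K * r * r
    return res_rr, res_thth


if __name__ == "__main__":
    for K in (0.5, 0.0, -0.5):  # positive, zero and negative curvature
        print(K, residuals(0.7, K))
    # With A = 2 the second residual equals 1 - A = -1, so A = 1 is required.
    print(residuals(0.7, 0.5, A=2.0))
```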

Notice also that this is exactly the same form as the metric for a 3-sphere 
embedded in four-dimensional Euclidean space, which we discussed in Chapter 2. 
The metric contains a ‘hidden symmetry’, since the origin of the radial coordinate 
is completely arbitrary. We can choose any point in this space as our origin since 
all points are equivalent. There is no centre in this space. We also note that, 
on scales small compared with the spatial curvature, the line element (14.8) is 
equivalent to that of a three-dimensional Euclidean space. 
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14.6 The Friedmann-Robertson-Walker metric 


Combining the our expression (14.8) for the maximally symmetric 3-space with 
the line element (14.4), which incorporates the cosmological principle and Weyl’s 
postulate, we obtain 

T dr 2 

ds 2 = c 2 dt 2 — S 2 (t) -» + r 2 {dd 2 + sin 2 0 d(jr) . (14.9) 

1 — Kr- 

It is usual to write this line element in an alternative form in which the 
arbitrariness in the magnitude of K is absorbed into the radial coordinate and the 
scale factor. Assuming firstly that $K \neq 0$, we define the variable $k = K/|K|$ in
such a way that $k = \pm 1$ depending on whether K is positive or negative. If we
introduce the rescaled coordinate

$$\bar{r} = |K|^{1/2}\,r,$$

then (14.9) becomes

$$ds^2 = c^2\,dt^2 - \frac{S^2(t)}{|K|}\left[\frac{d\bar{r}^2}{1 - k\bar{r}^2} + \bar{r}^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right].$$

Finally, we define a rescaled scale function R(t) by

$$R(t) = \begin{cases} S(t)/|K|^{1/2} & \text{if } K \neq 0, \\ S(t) & \text{if } K = 0. \end{cases}$$


Then, dropping the bars on the radial coordinate, we obtain the standard form for
the Friedmann-Robertson-Walker (FRW) line element,

$$ds^2 = c^2\,dt^2 - R^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right], \tag{14.10}$$

where k takes the values $-1$, 0 or 1 depending on whether the spatial section
has negative, zero or positive curvature respectively. It is also clear that the
coordinates $(r, \theta, \phi)$ appearing in the FRW metric are still comoving, i.e. the
worldline of a galaxy, ignoring any peculiar velocity, has fixed values of $(r, \theta, \phi)$.



14.7 Geometric properties of the FRW metric 

The geometric properties of the homogeneous and isotropic 3-space corresponding 
to the hypersurface t = constant depend upon whether $k = -1$, 0 or 1. We now
consider each of these cases in turn.

Positive spatial curvature: k = 1

In the case k = 1, we see that the coefficient of $dr^2$ in the FRW metric becomes
singular as $r \to 1$. We therefore introduce a new radial coordinate $\chi$ defined by
the relation

$$r = \sin\chi \quad\Rightarrow\quad dr = \cos\chi\,d\chi = (1 - r^2)^{1/2}\,d\chi,$$

so that the spatial part of the FRW metric takes the form

$$d\sigma^2 = R^2\left[d\chi^2 + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right],$$

where R is the value of the scale factor at the particular time t defining the
spacelike hypersurface of interest.

Some insight into this spatial metric may be gained by considering the 3-space
as embedded in a four-dimensional Euclidean space with coordinates (w, x, y, z),
where

$$\begin{aligned}
w &= R\cos\chi, \\
x &= R\sin\chi\sin\theta\cos\phi, \\
y &= R\sin\chi\sin\theta\sin\phi, \\
z &= R\sin\chi\cos\theta.
\end{aligned}$$

In fact we have already discussed exactly this embedding in Section 2.9. Such an
embedding is possible since one can write

$$d\sigma^2 = dw^2 + dx^2 + dy^2 + dz^2 = R^2\left[d\chi^2 + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right],$$

where, from the transformation equations, we have the constraint

$$w^2 + x^2 + y^2 + z^2 = R^2.$$

This shows that our 3-space can be considered as a three-dimensional sphere in the
four-dimensional Euclidean space. This hypersurface is defined by the coordinate
ranges

$$0 \le \chi \le \pi, \qquad 0 \le \theta \le \pi, \qquad 0 \le \phi \le 2\pi.$$

The surfaces $\chi$ = constant are 2-spheres with surface area

$$A = \int_{\theta=0}^{\pi}\int_{\phi=0}^{2\pi}(R\sin\chi\,d\theta)(R\sin\chi\sin\theta\,d\phi) = 4\pi R^2\sin^2\chi,$$

and $(\theta, \phi)$ are the standard spherical polar coordinates of these 2-spheres. Thus, as
$\chi$ varies from 0 to $\pi$, the area of the 2-spheres increases from zero to a maximum
value of $4\pi R^2$ at $\chi = \pi/2$, after which it decreases to zero at $\chi = \pi$. The proper
radius of a 2-sphere is $R\chi$, and so the surface area is smaller than that of a sphere
of radius $R\chi$ in Euclidean space.

The entire 3-space has a finite total volume given by

$$V = \int_{\chi=0}^{\pi}\int_{\theta=0}^{\pi}\int_{\phi=0}^{2\pi}(R\,d\chi)(R\sin\chi\,d\theta)(R\sin\chi\sin\theta\,d\phi) = 2\pi^2R^3,$$

which is the reason why, in this case, R is often referred to as the ‘radius of the
universe’.
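
The area and volume results for the k = 1 case can be checked numerically. The following is a minimal sketch, assuming a hand-written Simpson rule and illustrative function names (`simpson`, `sphere_area`, `closed_space_volume`) that are not part of the text; it integrates the area of the 2-spheres over the proper radial distance $R\,d\chi$.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def sphere_area(R, chi):
    """Proper area of the 2-sphere chi = constant in the k = 1 space."""
    return 4 * math.pi * R**2 * math.sin(chi)**2

def closed_space_volume(R):
    """Total proper volume: area times proper thickness R dchi, for 0 <= chi <= pi."""
    return simpson(lambda chi: sphere_area(R, chi) * R, 0.0, math.pi)

R = 2.0
# The two printed numbers should agree to high accuracy.
print(closed_space_volume(R), 2 * math.pi**2 * R**3)
```

The same `sphere_area` function also shows that, at a given proper radius $R\chi$, the area falls below the Euclidean value $4\pi(R\chi)^2$.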


Zero spatial curvature: k = 0 

In this case, if we set $r = \chi$ (to keep our notation consistent), the 3-space line
element is

$$d\sigma^2 = R^2\left[d\chi^2 + \chi^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right],$$

which is simply the ordinary three-dimensional Euclidean space. As usual, under
the transformation

$$x = R\chi\sin\theta\cos\phi, \qquad y = R\chi\sin\theta\sin\phi, \qquad z = R\chi\cos\theta,$$

the line element becomes

$$d\sigma^2 = dx^2 + dy^2 + dz^2.$$


Negative spatial curvature: k = -1

In this case, it is convenient to introduce a radial coordinate $\chi$ given by

$$r = \sinh\chi \quad\Rightarrow\quad dr = \cosh\chi\,d\chi = (1 + r^2)^{1/2}\,d\chi,$$

so that the spatial part of the FRW metric becomes

$$d\sigma^2 = R^2\left[d\chi^2 + \sinh^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right].$$

We cannot embed this 3-space in a four-dimensional Euclidean space, but it can
be embedded in a four-dimensional Minkowski space with coordinates (w, x, y, z)
given by

$$\begin{aligned}
w &= R\cosh\chi, \\
x &= R\sinh\chi\sin\theta\cos\phi, \\
y &= R\sinh\chi\sin\theta\sin\phi, \\
z &= R\sinh\chi\cos\theta.
\end{aligned}$$

In this case, we can write

$$d\sigma^2 = dx^2 + dy^2 + dz^2 - dw^2,$$

together with the constraint

$$w^2 - x^2 - y^2 - z^2 = R^2,$$

which shows that the 3-space can be represented as a three-dimensional hyperboloid
in the four-dimensional Minkowski space. The hypersurface is defined by
the coordinate ranges

$$0 \le \chi < \infty, \qquad 0 \le \theta \le \pi, \qquad 0 \le \phi \le 2\pi.$$

The 2-surfaces $\chi$ = constant are 2-spheres with surface area

$$A = 4\pi R^2\sinh^2\chi,$$

which increases indefinitely as $\chi$ increases. The proper radius of such a 2-sphere
is $R\chi$, and so the surface area is larger than the corresponding result in Euclidean
space. The total volume of the space is infinite.


From the above discussion, we see that a convenient form for the FRW metric is

$$ds^2 = c^2\,dt^2 - R^2(t)\left[d\chi^2 + S^2(\chi)\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right], \tag{14.11}$$

where the function $r = S(\chi)$ is given by

$$S(\chi) = \begin{cases} \sin\chi & \text{if } k = 1, \\ \chi & \text{if } k = 0, \\ \sinh\chi & \text{if } k = -1. \end{cases} \tag{14.12}$$

Once again, it is clear that $(\chi, \theta, \phi)$ are comoving coordinates.
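
The three cases of (14.12) can be collected into a single function. The sketch below is illustrative only; the names `S` and `area_ratio` are assumptions introduced here. It compares the area $4\pi R^2S^2(\chi)$ of a 2-sphere of proper radius $R\chi$ with the Euclidean value $4\pi(R\chi)^2$ for each sign of the curvature.

```python
import math

def S(chi, k):
    """The function S(chi) of equation (14.12) for k = 1, 0 or -1."""
    if k == 1:
        return math.sin(chi)
    if k == 0:
        return chi
    if k == -1:
        return math.sinh(chi)
    raise ValueError("k must be -1, 0 or 1")

def area_ratio(chi, k):
    """Area of the 2-sphere chi = constant divided by 4*pi*(R*chi)**2."""
    return (S(chi, k) / chi) ** 2

for k in (1, 0, -1):
    # Ratio below 1 for k = 1, equal to 1 for k = 0, above 1 for k = -1.
    print(k, area_ratio(1.0, k), area_ratio(1e-3, k))
```

For small $\chi$ all three ratios approach unity, in line with the remark that the geometry is Euclidean on scales small compared with the spatial curvature.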


14.8 Geodesics in the FRW metric 

In the comoving coordinate system(s) we have defined above, the galaxies have 
fixed spatial coordinates (by construction; any peculiar velocities are ignored). 
Thus the ‘cosmological fluid' is at rest in the comoving frame we have chosen. 
We now consider the motion of particles travelling with respect to this comoving 
frame. In particular, we consider the geodesic motion of ‘free’ particles, i.e. those 
experiencing only the ‘background’ gravitational field of the cosmological fluid 
and no other forces. Examples of such particles might include a projectile shot out 
of a galaxy or a photon travelling through intergalactic space. We could use the 
‘Lagrangian’ procedure to calculate the geodesic equations for the FRW metric,
but instead we take advantage of the fact that the spatial part of the metric is 
homogeneous and isotropic to arrive at the equations rather more quickly. 
It is convenient to express the FRW metric in the form (14.11) and write
$[x^\mu] = (t, \chi, \theta, \phi)$, so that

$$g_{00} = c^2, \qquad g_{11} = -R^2(t), \qquad g_{22} = -R^2(t)S^2(\chi), \qquad g_{33} = -R^2(t)S^2(\chi)\sin^2\theta.$$

The path of a particle is given by the geodesic equation

$$\dot{u}^\mu + \Gamma^\mu{}_{\nu\sigma}u^\nu u^\sigma = 0,$$

where $u^\mu = \dot{x}^\mu$ and the dot corresponds to differentiation with respect to some
affine parameter. For our present purposes, however, it will be more useful to
rewrite the geodesic equation in the form

$$\dot{u}_\mu = \tfrac{1}{2}\left(\partial_\mu g_{\nu\sigma}\right)u^\nu u^\sigma,$$

which shows, as expected, that if the metric is independent of a particular
coordinate $x^\lambda$ then $u_\lambda$ is conserved along the geodesic.

Let us suppose that the geodesic passes through some spatial point P. Since the
spatial part of the metric is spatially homogeneous and isotropic we can, without
loss of generality, take the spatial origin of the coordinate system, i.e. $\chi = 0$, to
be at the point P. This simplifies the analysis considerably.

Consider first the $\phi$-component $u_3$. Since the metric is independent of $\phi$, we
have $\dot{u}_3 = 0$ so that $u_3$ is constant along the geodesic. But

$$u_3 = g_{33}u^3 = -R^2(t)S^2(\chi)\sin^2\theta\,u^3,$$

so that $u_3 = 0$ at the point P where $\chi = 0$. Thus $u_3 = 0$ along the path and so also
we have $u^3 = \dot{\phi} = 0$. Hence, along the geodesic,

$$\phi = \text{constant}.$$

For the $\theta$-component, we have

$$\dot{u}_2 = \tfrac{1}{2}\left(\partial_2 g_{\nu\sigma}\right)u^\nu u^\sigma. \tag{14.13}$$

The only component of $g_{\nu\sigma}$ that depends on $x^2 = \theta$ is $g_{33}$, but the contribution of
the corresponding term in (14.13) vanishes since $u^3 = 0$. Thus $\dot{u}_2 = 0$ and so $u_2$
is constant along the geodesic. Again

$$u_2 = g_{22}u^2 = -R^2(t)S^2(\chi)\,u^2,$$

which vanishes at P ($\chi = 0$), and so $u_2$ is zero along the geodesic, as is $u^2$, so that

$$\theta = \text{constant}.$$

For the $\chi$-component,

$$\dot{u}_1 = \tfrac{1}{2}\left(\partial_1 g_{\nu\sigma}\right)u^\nu u^\sigma. \tag{14.14}$$

We have $u^2 = u^3 = 0$, while $g_{00}$ and $g_{11}$ are independent of $\chi$. Thus, $\dot{u}_1 = 0$ so
that $u_1$ is constant along the geodesic, so $u_1 = g_{11}u^1$ must be constant. Thus, we
have

$$R^2(t)\,\dot{\chi} = \text{constant}. \tag{14.15}$$

Finally, $u^0$ can be found from the appropriate normalisation condition,
$u^\mu u_\mu = c^2$ for massive particles or $u^\mu u_\mu = 0$ for photons. Thus, we have

$$\dot{t}^2 = \begin{cases} 1 + \dfrac{R^2(t)\dot{\chi}^2}{c^2} & \text{for a massive particle}, \\[2ex] \dfrac{R^2(t)\dot{\chi}^2}{c^2} & \text{for a photon}. \end{cases}$$


14.9 The cosmological redshift 


We can use the results of the last section to derive the cosmological redshift.
Suppose that a photon is emitted at cosmic time $t_E$ by a comoving observer with
fixed spatial coordinates $(\chi_E, \theta_E, \phi_E)$ and that the photon is received at time
$t_R$ by another observer at fixed comoving coordinates. We may take the latter
observer to be at the origin of our spatial coordinate system.

For a photon one can choose an affine parameter such that the 4-momentum
is $p^\mu = \dot{x}^\mu$. From our above discussion, $d\theta = d\phi = 0$ along the photon geodesic,
or equivalently $p_2 = p_3 = 0$, and (14.15) shows that $p_1$ is constant along the
geodesic. Since the photon momentum is null, we also require that $g^{\mu\nu}p_\mu p_\nu = 0$,
which reduces to

$$\frac{1}{c^2}(p_0)^2 - \frac{1}{R^2(t)}(p_1)^2 = 0,$$

from which we find $p_0 = cp_1/R(t)$.

In Appendix 9A we showed that, for an emitter and receiver with fixed spatial
coordinates, the frequency shift of the photon is given, in general, by

$$\frac{\nu_R}{\nu_E} = \frac{p_0(R)}{p_0(E)}\left[\frac{g_{00}(E)}{g_{00}(R)}\right]^{1/2}. \tag{14.16}$$

For the FRW metric we have $g_{00} = c^2$, and so we find immediately that

$$1 + z \equiv \frac{\nu_E}{\nu_R} = \frac{R(t_R)}{R(t_E)}. \tag{14.17}$$

Thus we see that if the scale factor R(t) is increasing with cosmic time, so that the
universe is expanding, then the photon is redshifted by an amount z. Conversely,
if the universe were contracting then the photon would be blueshifted. Only if the
universe were static, so that R = constant, would there be no frequency shift.

In fact, we may also arrive at this result directly from the FRW metric. Since
$ds = d\theta = d\phi = 0$ along the photon path, from (14.11) we have, for an incoming
photon,

$$\int_{t_E}^{t_R}\frac{c\,dt}{R(t)} = \int_0^{\chi_E}d\chi.$$

Now, if the emitter sends a second light pulse at time $t_E + \delta t_E$, which is received
at time $t_R + \delta t_R$, then

$$\int_{t_E+\delta t_E}^{t_R+\delta t_R}\frac{c\,dt}{R(t)} = \int_0^{\chi_E}d\chi = \int_{t_E}^{t_R}\frac{c\,dt}{R(t)},$$

from which we see immediately that

$$\int_{t_R}^{t_R+\delta t_R}\frac{c\,dt}{R(t)} = \int_{t_E}^{t_E+\delta t_E}\frac{c\,dt}{R(t)}.$$

Assuming that $\delta t_E$ and $\delta t_R$ are small, so that R(t) can be taken as constant in
both integrals, we have

$$\frac{\delta t_R}{R(t_R)} = \frac{\delta t_E}{R(t_E)}.$$

Considering the pulses to be the successive wavecrests of an electromagnetic
wave, we again find that

$$1 + z \equiv \frac{\nu_E}{\nu_R} = \frac{\delta t_R}{\delta t_E} = \frac{R(t_R)}{R(t_E)}.$$
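
The two-pulse argument can be illustrated numerically. The sketch below is not taken from the text: it assumes, purely for illustration, an expansion law $R(t) = t^{2/3}$ in units with c = 1, for which the integral of $dt/R(t)$ has a closed form, and the function names are hypothetical. It sends two pulses a short time apart and compares the ratio of reception and emission intervals with the ratio of scale factors.

```python
def scale_factor(t):
    """Assumed expansion law R(t) = t**(2/3) (arbitrary units, c = 1)."""
    return t ** (2.0 / 3.0)

def arrival_time(t_emit, chi):
    """Solve chi = integral of dt / R(t) from t_emit to t_recv for t_recv.

    For R = t**(2/3) the integral equals 3 * (t_recv**(1/3) - t_emit**(1/3)).
    """
    return (t_emit ** (1.0 / 3.0) + chi / 3.0) ** 3

def pulse_interval_ratio(t_emit, chi, dt_emit=1e-6):
    """Ratio (delta t_R) / (delta t_E) for two pulses emitted dt_emit apart."""
    t_recv = arrival_time(t_emit, chi)
    dt_recv = arrival_time(t_emit + dt_emit, chi) - t_recv
    return dt_recv / dt_emit

t_E, chi_E = 1.0, 3.0
t_R = arrival_time(t_E, chi_E)
# Both numbers estimate 1 + z for this emitter.
print(pulse_interval_ratio(t_E, chi_E), scale_factor(t_R) / scale_factor(t_E))
```

With these illustrative values the photon is received at $t_R = 8\,t_E$, and both expressions give $1 + z$ close to 4.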


14.10 The Hubble and deceleration parameters 

In a common notation we shall write the present cosmic time, or epoch, as $t_0$. Thus
photons received today from distant galaxies are received at $t_0$. If the emitting
galaxy is nearby and emits a photon at cosmic time t, we can write $t = t_0 - \delta t$,
where $\delta t \ll t_0$. Thus, let us expand the scale factor R(t) as a power series about
the present epoch $t_0$ to obtain

$$\begin{aligned}
R(t) &= R[t_0 - (t_0 - t)] \\
&= R(t_0) - (t_0 - t)\dot{R}(t_0) + \tfrac{1}{2}(t_0 - t)^2\ddot{R}(t_0) - \cdots \\
&= R(t_0)\left[1 - (t_0 - t)H(t_0) - \tfrac{1}{2}(t_0 - t)^2q(t_0)H^2(t_0) - \cdots\right],
\end{aligned} \tag{14.18}$$

where we have introduced the Hubble parameter H(t) and the deceleration
parameter q(t). These are given by

$$H(t) \equiv \frac{\dot{R}(t)}{R(t)}, \qquad q(t) \equiv -\frac{\ddot{R}(t)R(t)}{\dot{R}^2(t)}, \tag{14.19}$$

where the dot corresponds to differentiation with respect to cosmic time t. It
should be noted that these definitions are valid at any cosmic time. The present-day
values of these parameters are usually denoted by $H_0 = H(t_0)$ and $q_0 = q(t_0)$.

Using these definitions, we can write the redshift z in terms of the ‘look-back
time’ $t_0 - t$ as

$$z = \frac{R(t_0)}{R(t)} - 1 = \left[1 - (t_0 - t)H_0 - \tfrac{1}{2}(t_0 - t)^2q_0H_0^2 - \cdots\right]^{-1} - 1$$

and, assuming that $t_0 - t \ll t_0$, we have

$$z = (t_0 - t)H_0 + (t_0 - t)^2\left(1 + \tfrac{1}{2}q_0\right)H_0^2 + \cdots. \tag{14.20}$$

Since it is the redshift that is an observable quantity, it is more useful to invert
the above power series to obtain the look-back time $t_0 - t$ in terms of z. Thus for
$z \ll 1$ we have

$$t_0 - t = H_0^{-1}z - H_0^{-1}\left(1 + \tfrac{1}{2}q_0\right)z^2 + \cdots. \tag{14.21}$$


It is worth noting that, as one might expect in this approximation, the relations 
(14.20) and (14.21) depend only on the present-day values H 0 and q 0 of the Hubble 
and deceleration parameters and hence may be evaluated without knowledge of 
the complete expansion history R(t) of the universe. 
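
As a rough check of (14.20) and (14.21), one can compare the truncated series with an exact redshift for a specific expansion law. The sketch below assumes, for illustration only, $R(t) \propto t^{2/3}$, for which the definitions (14.19) give $H_0 = 2/(3t_0)$ and $q_0 = 1/2$; the function names are hypothetical.

```python
def exact_redshift(t, t0):
    """z = R(t0)/R(t) - 1 for the assumed law R proportional to t**(2/3)."""
    return (t0 / t) ** (2.0 / 3.0) - 1.0

def series_redshift(lookback, H0, q0):
    """Equation (14.20), truncated after the quadratic term."""
    x = lookback * H0
    return x + (1.0 + 0.5 * q0) * x**2

def series_lookback(z, H0, q0):
    """Equation (14.21), truncated after the quadratic term."""
    return (z - (1.0 + 0.5 * q0) * z**2) / H0

t0 = 1.0
H0 = 2.0 / (3.0 * t0)   # H = Rdot / R for R = t**(2/3)
q0 = 0.5                # q = -Rddot * R / Rdot**2 for R = t**(2/3)
lookback = 0.03 * t0

z = exact_redshift(t0 - lookback, t0)
print(z, series_redshift(lookback, H0, q0))       # exact versus series redshift
print(lookback, series_lookback(z, H0, q0))       # exact versus series look-back time
```

The residual differences are of third order in the small quantity $(t_0 - t)H_0$, as expected for series truncated at second order.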

Using the Taylor expansion (14.18), we can also obtain an approximate expression
for the $\chi$-coordinate of the emitting galaxy, which is given by

$$\chi = \int_t^{t_0}\frac{c\,dt}{R(t)} = cR_0^{-1}\int_t^{t_0}\left[1 - (t_0 - t)H_0 - \cdots\right]^{-1}dt.$$

Assuming once more that $t_0 - t \ll t_0$, we have

$$\chi = cR_0^{-1}\left[(t_0 - t) + \tfrac{1}{2}(t_0 - t)^2H_0 + \cdots\right]. \tag{14.22}$$

We may now substitute for the look-back time $t_0 - t$ in this result using (14.21),
to obtain an expression for the $\chi$-coordinate of the emitting galaxy in terms of its
redshift (assuming $z \ll 1$), which reads

$$\chi = \frac{c}{R_0H_0}\left[z - \tfrac{1}{2}(1 + q_0)z^2 + \cdots\right]. \tag{14.23}$$


Once again, in this approximation the results (14.22) and (14.23) only depend 
on the present-day values H 0 and q Q and may be evaluated without knowing the 
expansion history of the universe. 

From the FRW metric, we see that the proper distance d to the emitting galaxy³
at cosmic time $t_0$ is $d = R_0\chi$. Thus, for very nearby galaxies, $d \approx c(t_0 - t)$.
Moreover, from (14.20), in this case $z \approx (t_0 - t)H_0$. So, if we were to interpret
the cosmological redshift as a Doppler shift due to a recession velocity v of the
emitting galaxy, we would obtain

$$v = cz = H_0d, \tag{14.24}$$

which is approximately valid for small z. The galaxies will therefore appear to
recede from us with a recession speed proportional to their distance from us.
This is, of course, Hubble’s law, named after Edwin Hubble, who discovered
the expansion of the universe in 1929 by comparing redshifts with distance
measurements to nearby galaxies (derived from the period-luminosity relation of
Cepheid variables). His results suggested a linear recession law, as in (14.24).
This was an amazing result. It implies that the universe started off at high density
at some finite time in the past. You will notice from (14.24) that the Hubble
‘constant’ has the dimensions of inverse time. As we will see later, the quantity
$1/H_0$ gives the age of the universe to within a factor of order unity. It is clear
that, in general, the Hubble parameter will vary with cosmic time t and hence
with redshift z. By combining the expressions (14.18), (14.19) and (14.21), we
can obtain an expression for how the Hubble parameter varies with z for small
redshift:

$$H(z) = H_0\left[1 + (1 + q_0)z - \cdots\right]. \tag{14.25}$$


3 In order to measure the proper distance d, one would in fact have to arrange for all the ‘civilisations’ along
the route to the galaxy to lay out measuring rods at the same cosmic time $t_0$. This could be synchronised by,
for example, requiring the temperature of the cosmic microwave background or the mean matter density of
the universe to have a given value. We will discuss more practical measures of distance shortly.

So far, we have been considering the low-z limit. Having introduced the Hubble
parameter, however, we may use it to derive useful general expressions for the
look-back time to an emitting galaxy, and for its $\chi$-coordinate, as functions of the
redshift z of the received photon. In general, we have

$$dz = d(1 + z) = d\left(\frac{R_0}{R}\right) = -\frac{R_0}{R^2}\dot{R}\,dt = -(1 + z)H(z)\,dt,$$

which provides a very useful relation between an interval dz in redshift and the
corresponding interval dt in cosmic time. Thus, we can write the look-back time as

$$t_0 - t = \int_t^{t_0}dt = \int_0^z\frac{dz}{(1 + z)H(z)}, \tag{14.26}$$

and the galaxy’s $\chi$-coordinate is given by

$$\chi = \int_t^{t_0}\frac{c\,dt}{R(t)} = \frac{c}{R_0}\int_0^z\frac{dz}{H(z)}. \tag{14.27}$$


It is clear, however, that in order to evaluate either of these integrals we must 
know how H(z ) varies with z, which requires knowledge of the evolution of the 
scale factor R{t). 
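
Once a form for H(z) is specified, the integrals (14.26) and (14.27) are straightforward to evaluate numerically. The following is a minimal sketch, not a definitive implementation: the model $H(z) = H_0(1+z)^{3/2}$ is an assumption made purely for illustration (it is not derived in this chapter), and `simpson`, `lookback_time`, `comoving_distance` and `hubble_dust` are hypothetical names.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def lookback_time(z, hubble):
    """Equation (14.26): t0 - t as an integral over redshift."""
    return simpson(lambda x: 1.0 / ((1.0 + x) * hubble(x)), 0.0, z)

def comoving_distance(z, hubble, c=1.0):
    """R0 * chi from equation (14.27)."""
    return c * simpson(lambda x: 1.0 / hubble(x), 0.0, z)

H0 = 1.0  # units in which H0 = c = 1

def hubble_dust(z):
    """Assumed illustrative model H(z) = H0 * (1 + z)**1.5."""
    return H0 * (1.0 + z) ** 1.5

for z in (0.01, 0.1, 1.0):
    print(z, lookback_time(z, hubble_dust), comoving_distance(z, hubble_dust))
```

For this assumed model the integrals can also be done by hand, giving $H_0(t_0 - t) = \tfrac{2}{3}[1 - (1+z)^{-3/2}]$ and $R_0\chi = (2c/H_0)[1 - (1+z)^{-1/2}]$, which provides a check on the numerical values.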


14.11 Distances in the FRW geometry 

Distance measures in an expanding universe can be confusing. For example, let us 
consider the distance to some remote galaxy. The light received from the galaxy 
was emitted when the universe was younger, because light travels at a finite 
speed c. Evidently, as we look at more distant objects, we see them as they were 
at an earlier time in the universe’s history when proper distances were smaller, 
since the universe is expanding. What, therefore, do we mean by the ‘distance’ to 
a galaxy? In fact, interpreting and calculating distances in an expanding universe 
is straightforward, but one must be clear about what is meant by ‘distance’. 

From the FRW metric

$$ds^2 = c^2\,dt^2 - R^2(t)\left[d\chi^2 + S^2(\chi)\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right],$$

we can define a number of different measures of distance. The parameter $\chi$ is
a comoving coordinate that is sometimes referred to as the coordinate distance,
whereas the proper distance to an object at some cosmic time t is $d = R(t)\chi$, but
this cannot be measured in practice. Thus, we must look for alternative ways of
defining the distance to an object. The two most important operationally defined
distance measures are the luminosity distance and the angular diameter distance.
These distance measures form the basis for observational tests of the geometry of
the universe.
Luminosity distance

In an ordinary static Euclidean universe, if a source of absolute luminosity L
(measured in W = J s$^{-1}$) is at a distance d then the flux that we receive (measured
in W m$^{-2}$) is $F = L/(4\pi d^2)$. Now suppose that we are actually in an expanding
FRW geometry, but we know that the source has a luminosity L and we observe
a flux F. The quantity

$$d_L = \left(\frac{L}{4\pi F}\right)^{1/2} \tag{14.28}$$

is called the luminosity distance of the source. This is an operational definition,
and we must now investigate how to express it in terms of the FRW metric.

Consider an emitting source E with a fixed comoving coordinate $\chi$ relative to
an observer O (note that, by symmetry, the emitter would assign the same value of
$\chi$ to the observer). We assume that the absolute luminosity of E as a function of
cosmic time is L(t) and that the photons it emits are detected by O at cosmic time
$t_0$. Clearly, the photons must have been emitted at an earlier time $t_e$. Assuming
the photons to have been emitted isotropically, the radiation will be spread evenly
over a sphere centred on E and passing through O (see Figure 14.2). The proper
area of this sphere is

$$A = 4\pi R^2(t_0)S^2(\chi).$$

However, each photon received by O is redshifted in frequency, so that

$$\nu_0 = \frac{\nu_e}{1 + z},$$

and, moreover, the arrival rate of the photons is also reduced by the same factor.
Thus, the observed flux at O is

$$F(t_0) = \frac{L(t_e)}{4\pi\left[R_0S(\chi)\right]^2}\,\frac{1}{(1 + z)^2}.$$

The luminosity distance defined above is now evaluated as

$$d_L = R_0S(\chi)(1 + z). \tag{14.29}$$

This is an important quantity, which can be used practically, but note that it
depends on the time history of the scale factor through the dependence on $\chi$.

Figure 14.2 Geometry associated with the definition of luminosity distance
(with one spatial dimension suppressed).



Angular diameter distance 

Another important distance measure is based upon the notion of the existence of
some standard-length ‘rods’, whose angular diameter we can observe. Suppose
that a source has proper diameter $\ell$. Then, in Euclidean space, if it were at a
distance d it would subtend an angular diameter $\Delta\theta = \ell/d$. In an FRW geometry,
we thus define the angular diameter distance to an object to be

$$d_A = \frac{\ell}{\Delta\theta}. \tag{14.30}$$

This is again an operational definition, and we now investigate how to express it
in terms of the FRW metric.

Suppose we have two radial null geodesics (light paths) meeting at the observer
at time $t_0$ with an angular separation $\Delta\theta$, having been emitted at time $t_e$ from
a source of proper diameter $\ell$ at a fixed comoving coordinate $\chi$ (assuming,
for simplicity, that the spatial axes are oriented so that $\phi$ = constant along the
photon paths); see Figure 14.3. To obtain a clearer view of the specification of
the coordinates, we may look vertically down the worldline of O and define the
coordinates as in Figure 14.4. From the angular part of the FRW metric we have

$$\ell = R(t_e)S(\chi)\,\Delta\theta,$$

so that

$$d_A = R(t_e)S(\chi) = R(t_0)\frac{R(t_e)}{R(t_0)}S(\chi) = \frac{R(t_0)S(\chi)}{1 + z}.$$

Thus the angular diameter distance is given by

$$d_A = \frac{R_0S(\chi)}{1 + z}. \tag{14.31}$$
Figure 14.3 Geometry associated with the definition of angular diameter
distance (with one spatial dimension suppressed).

Figure 14.4 Specification of the coordinates in the definition of angular diameter
distance. The observer O is at $(t_0, 0, 0, 0)$, and the two ends of the source, a proper
length $\ell$ apart, are at $(t_e, \chi, \theta, \phi)$ and $(t_e, \chi, \theta + \Delta\theta, \phi)$.


This differs from the luminosity distance $d_L$ by a factor $(1 + z)^2$, emphasizing
again that ‘distance’ depends on definition. Again, because of the $\chi$-dependence
we need to know the time history of the scale factor R(t) to evaluate $d_A$.
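
The relations (14.29) and (14.31) are simple enough to encode directly. The sketch below uses hypothetical function names and arbitrary illustrative input values; it confirms that the operational definition (14.28), applied to the flux in an FRW geometry, reproduces $d_L$, and that $d_L$ and $d_A$ differ by the factor $(1+z)^2$.

```python
import math

def S(chi, k):
    """The function S(chi) of the FRW metric for k = 1, 0 or -1."""
    if k == 1:
        return math.sin(chi)
    if k == 0:
        return chi
    if k == -1:
        return math.sinh(chi)
    raise ValueError("k must be -1, 0 or 1")

def luminosity_distance(R0, chi, z, k):
    """Equation (14.29): d_L = R0 * S(chi) * (1 + z)."""
    return R0 * S(chi, k) * (1.0 + z)

def angular_diameter_distance(R0, chi, z, k):
    """Equation (14.31): d_A = R0 * S(chi) / (1 + z)."""
    return R0 * S(chi, k) / (1.0 + z)

def observed_flux(L, R0, chi, z, k):
    """Flux at O: luminosity spread over 4*pi*[R0*S(chi)]**2, reduced by (1 + z)**2."""
    return L / (4.0 * math.pi * (R0 * S(chi, k)) ** 2 * (1.0 + z) ** 2)

# Arbitrary illustrative values (not taken from the text).
L, R0, chi, z, k = 3.0, 2.0, 0.7, 1.5, -1
F = observed_flux(L, R0, chi, z, k)
print(math.sqrt(L / (4.0 * math.pi * F)), luminosity_distance(R0, chi, z, k))
print(luminosity_distance(R0, chi, z, k) / angular_diameter_distance(R0, chi, z, k))
```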


14.12 Volumes and number densities in the FRW geometry 

The interpretation of cosmological observations often requires one to determine
the volume of some three-dimensional region of the FRW geometry. Consider a
comoving cosmological observer, whom we may take to be at the origin $\chi = 0$
of our comoving coordinate system. From the FRW metric

$$ds^2 = c^2\,dt^2 - R^2(t)\left[d\chi^2 + S^2(\chi)\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right],$$

we see that, at cosmic time $t_0$, the proper volume of the region of space lying in
the infinitesimal coordinate range $\chi \to \chi + d\chi$ and subtending an infinitesimal
solid angle $d\Omega = \sin\theta\,d\theta\,d\phi$ at the observer is

$$dV_0 = (R_0\,d\chi)\left[R_0^2S^2(\chi)\,d\Omega\right] = R_0^3S^2(\chi)\,d\chi\,d\Omega.$$

For the interval $\chi \to \chi + d\chi$ in the radial comoving coordinate there exists a
corresponding interval $z \to z + dz$ in the redshift of objects lying in this radial
range (and also a corresponding cosmic time interval $t \to t + dt$ within which the
light observed by O at $t = t_0$ was emitted). We may therefore write the volume
element as

$$dV_0 = R_0^3S^2(\chi)\frac{d\chi}{dz}\,dz\,d\Omega.$$

From (14.27), however, we have

$$\frac{d\chi}{dz} = \frac{c}{R_0H(z)},$$

and so

$$dV_0 = \frac{cR_0^2S^2(\chi(z))}{H(z)}\,dz\,d\Omega,$$

where we have made explicit that $\chi$ is also a function of z. This volume element
is illustrated in Figure 14.5. For an expanding universe, the proper volume of this
comoving region will be smaller at some earlier cosmic time t (which corresponds
to some redshift z). Indeed, using (14.17), we have

$$dV(z) = \frac{dV_0}{(1 + z)^3} = \frac{cR_0^2S^2(\chi(z))}{(1 + z)^3H(z)}\,dz\,d\Omega. \tag{14.32}$$

Figure 14.5 Geometry associated with the definition of a proper volume element
$dV_0$ at cosmic time $t = t_0$ (with one spatial dimension suppressed).


The main use of the result (14.32) is in predicting the number of galaxies (of a
certain type) that one would expect to observe in a given area of sky and redshift
interval, and comparing that result with observations. Suppose, for example, that
the proper number density of galaxies of a certain type at a redshift z is given by
n(z). Using (14.32), the total number dN of such objects in the redshift interval
$z \to z + dz$ and in a solid angle $d\Omega$ is

$$dN = n(z)\,dV(z) = \frac{cR_0^2S^2(\chi(z))}{H(z)}\,\frac{n(z)}{(1 + z)^3}\,dz\,d\Omega. \tag{14.33}$$

The above expression has been arranged to make use of the fact that, if objects are
conserved (so that, once formed, galaxies are not later destroyed), we may write
$n(z)/(1 + z)^3 = n_0$, where $n_0$ is the present-day proper number density of such
objects; hence the resulting expression is simplified somewhat. As an illustration,
let us consider a population of galaxies which are formed instantaneously at a
redshift $z = z_f$, which are not later destroyed and which have a present-day number
density $n_0$. From (14.33), the total number of such objects in the whole sky is

$$N = 4\pi cn_0R_0^2\int_0^{z_f}\frac{S^2(\chi(z))}{H(z)}\,dz.$$

Clearly, in order to evaluate this integral one requires knowledge of the expansion
history R(t) of the universe.
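
For the spatially flat case k = 0, where $S(\chi) = \chi$, the count integral admits a simple consistency check: since $d(R_0\chi)/dz = c/H(z)$, the integral reduces to $N = \tfrac{4}{3}\pi n_0(R_0\chi_f)^3$, the Euclidean count within the present-day proper distance of the formation redshift. The sketch below illustrates this numerically; the model $H(z) = H_0(1+z)^{3/2}$ and all function names are assumptions made for illustration, in units with $H_0 = c = 1$.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def hubble(z, H0=1.0):
    """Assumed illustrative model H(z) = H0 * (1 + z)**1.5."""
    return H0 * (1.0 + z) ** 1.5

def distance_today(z, c=1.0):
    """R0 * chi(z) from equation (14.27)."""
    return c * simpson(lambda x: 1.0 / hubble(x), 0.0, z)

def total_count(z_f, n0, c=1.0):
    """N = 4*pi*c*n0 * integral of [R0*chi(z)]**2 / H(z) dz for k = 0."""
    integrand = lambda z: distance_today(z, c) ** 2 / hubble(z)
    return 4.0 * math.pi * c * n0 * simpson(integrand, 0.0, z_f)

z_f, n0 = 2.0, 1.0
d_f = distance_today(z_f)
# The two printed numbers should agree closely.
print(total_count(z_f, n0), 4.0 * math.pi * n0 * d_f**3 / 3.0)
```

For $k = \pm 1$ no such shortcut is available in general, and $S(\chi(z))$ must be evaluated inside the integral.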


14.13 The cosmological field equations 

So far we have investigated only the geometric and kinematic consequences of the 
FRW metric. The dynamics of the spacetime geometry is characterised entirely 
by the scale factor R(t). In order to determine the function R(t), we must solve 
the gravitational field equations in the presence of matter. 

From Chapter 8 the gravitational field equations, in the presence of a non-zero
cosmological constant, are

$$R_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}R + \Lambda g_{\mu\nu} = -\kappa T_{\mu\nu},$$

where $\kappa = 8\pi G/c^4$. It is, however, more convenient to express the field equations
in the alternative form

$$R_{\mu\nu} = -\kappa\left(T_{\mu\nu} - \tfrac{1}{2}Tg_{\mu\nu}\right) + \Lambda g_{\mu\nu}, \tag{14.34}$$
where $T = T^\mu{}_\mu$. In order to solve these equations, we clearly need a model for the
energy-momentum tensor of the matter that fills the universe. For simplicity, we
shall grossly idealise the universe and model the matter by a simple macroscopic
fluid, devoid of shear-viscous, bulk-viscous and heat-conductive properties. Thus
we assume a perfect fluid, which is characterised at each point by its proper density
$\rho$ and the pressure p in the instantaneous rest frame. The energy-momentum
tensor is given by

$$T^{\mu\nu} = \left(\rho + \frac{p}{c^2}\right)u^\mu u^\nu - pg^{\mu\nu}. \tag{14.35}$$

Since we are seeking solutions for a homogeneous and isotropic universe, the
density $\rho$ and pressure p must be functions of cosmic time t alone.

We may perform the calculation in any coordinate system, but the algebra is
simplified slightly by adopting the comoving coordinates $[x^\mu] = (t, r, \theta, \phi)$, in
which the FRW metric takes the form

$$ds^2 = c^2\,dt^2 - R^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right].$$

Thus the covariant components $g_{\mu\nu}$ of the metric are

$$g_{00} = c^2, \qquad g_{11} = -\frac{R^2(t)}{1 - kr^2}, \qquad g_{22} = -R^2(t)r^2, \qquad g_{33} = -R^2(t)r^2\sin^2\theta.$$

Since the metric is diagonal, the contravariant components $g^{\mu\nu}$ are simply the
reciprocals of the covariant components.

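The connection is given in terms of the metric by

$$\Gamma^\sigma{}_{\mu\nu} = \tfrac{1}{2}g^{\sigma\rho}\left(\partial_\nu g_{\rho\mu} + \partial_\mu g_{\rho\nu} - \partial_\rho g_{\mu\nu}\right),$$

from which it is straightforward to show that the only non-zero coefficients are

$$\begin{aligned}
\Gamma^0{}_{11} &= R\dot{R}/[c^2(1 - kr^2)], & \Gamma^0{}_{22} &= R\dot{R}r^2/c^2, & \Gamma^0{}_{33} &= (R\dot{R}r^2\sin^2\theta)/c^2, \\
\Gamma^1{}_{01} &= \dot{R}/R, & \Gamma^1{}_{11} &= kr/(1 - kr^2), & \Gamma^1{}_{22} &= -r(1 - kr^2), \\
\Gamma^1{}_{33} &= -r(1 - kr^2)\sin^2\theta, & \Gamma^2{}_{02} &= \dot{R}/R, & \Gamma^2{}_{12} &= 1/r, \\
\Gamma^2{}_{33} &= -\sin\theta\cos\theta, & \Gamma^3{}_{03} &= \dot{R}/R, & \Gamma^3{}_{13} &= 1/r, \\
\Gamma^3{}_{23} &= \cot\theta, & & & &
\end{aligned}$$
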

where the dots denote differentiation with respect to cosmic time t. We next
substitute these expressions for the connection coefficients into the expression for
the Ricci tensor,

$$R_{\mu\nu} = \partial_\nu\Gamma^\sigma{}_{\mu\sigma} - \partial_\sigma\Gamma^\sigma{}_{\mu\nu} + \Gamma^\rho{}_{\mu\sigma}\Gamma^\sigma{}_{\rho\nu} - \Gamma^\rho{}_{\mu\nu}\Gamma^\sigma{}_{\rho\sigma}.$$

After some tedious but straightforward algebra, we find that the off-diagonal
components of the Ricci tensor are zero and the diagonal components are given by

$$\begin{aligned}
R_{00} &= 3\ddot{R}/R, \\
R_{11} &= -(R\ddot{R} + 2\dot{R}^2 + 2c^2k)c^{-2}/(1 - kr^2), \\
R_{22} &= -(R\ddot{R} + 2\dot{R}^2 + 2c^2k)c^{-2}r^2, \\
R_{33} &= -(R\ddot{R} + 2\dot{R}^2 + 2c^2k)c^{-2}r^2\sin^2\theta.
\end{aligned}$$
We must now turn our attention to the right-hand side of the field equations
(14.34). In our comoving coordinate system $(t, r, \theta, \phi)$, the 4-velocity of the
fluid is simply

$$[u^\mu] = (1, 0, 0, 0),$$

which we can write as $u^\mu = \delta^\mu_0$. Thus the covariant components of the
4-velocity are

$$u_\mu = g_{\mu\nu}\delta^\nu_0 = g_{\mu 0} = c^2\delta^0_\mu,$$

so we can write the energy-momentum tensor (14.35) as

$$T_{\mu\nu} = (\rho c^2 + p)c^2\delta^0_\mu\delta^0_\nu - pg_{\mu\nu}.$$

Moreover, since $u^\mu u_\mu = c^2$, contraction of the energy-momentum tensor gives

$$T = T^\mu{}_\mu = \left(\rho + \frac{p}{c^2}\right)c^2 - p\,\delta^\mu_\mu = \rho c^2 - 3p.$$

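Hence we can write the terms on the right-hand side of the field equations (14.34)
that depend on the energy-momentum as

$$T_{\mu\nu} - \tfrac{1}{2}Tg_{\mu\nu} = (\rho c^2 + p)c^2\delta^0_\mu\delta^0_\nu - \tfrac{1}{2}(\rho c^2 - p)g_{\mu\nu}.$$

Including the cosmological-constant term, we find that the right-hand side of the
field equations (14.34) vanishes for $\mu \neq \nu$. The non-zero components read

$$\begin{aligned}
-\kappa\left(T_{00} - \tfrac{1}{2}Tg_{00}\right) + \Lambda g_{00} &= -\tfrac{1}{2}\kappa(\rho c^2 + 3p)c^2 + \Lambda c^2, \\
-\kappa\left(T_{11} - \tfrac{1}{2}Tg_{11}\right) + \Lambda g_{11} &= -\left[\tfrac{1}{2}\kappa(\rho c^2 - p) + \Lambda\right]R^2/(1 - kr^2), \\
-\kappa\left(T_{22} - \tfrac{1}{2}Tg_{22}\right) + \Lambda g_{22} &= -\left[\tfrac{1}{2}\kappa(\rho c^2 - p) + \Lambda\right]R^2r^2, \\
-\kappa\left(T_{33} - \tfrac{1}{2}Tg_{33}\right) + \Lambda g_{33} &= -\left[\tfrac{1}{2}\kappa(\rho c^2 - p) + \Lambda\right]R^2r^2\sin^2\theta.
\end{aligned}$$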


Combining these expressions with those for the components of the Ricci tensor,
we see that the three spatial field equations are equivalent, which is essentially
due to the homogeneity and isotropy of the FRW metric. Thus the gravitational
field equations yield just the two independent equations,

$$\begin{aligned}
3\ddot{R}/R &= -\tfrac{1}{2}\kappa(\rho c^2 + 3p)c^2 + \Lambda c^2, \\
R\ddot{R} + 2\dot{R}^2 + 2c^2k &= \left[\tfrac{1}{2}\kappa(\rho c^2 - p) + \Lambda\right]c^2R^2.
\end{aligned}$$

Eliminating $\ddot{R}$ from the second equation and remembering that $\kappa = 8\pi G/c^4$, we
finally arrive at the cosmological field equations

$$\begin{aligned}
\ddot{R} &= -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right)R + \tfrac{1}{3}\Lambda c^2R, \\
\dot{R}^2 &= \frac{8\pi G}{3}\rho R^2 + \tfrac{1}{3}\Lambda c^2R^2 - c^2k.
\end{aligned} \tag{14.36}$$

These two differential equations determine the time evolution of the scale factor
R(t) and are known as the Friedmann-Lemaître equations. In the case $\Lambda = 0$ they
are often called simply the Friedmann equations. We will discuss the solutions
to these equations in various cases in Chapter 15.
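
The right-hand sides of (14.36) translate directly into code, which gives a quick way to test a candidate solution. The sketch below is illustrative only: the function names are hypothetical, and the trial solution $R = t^{2/3}$ with $\rho = 1/(6\pi Gt^2)$, $p = 0$, $\Lambda = 0$, $k = 0$ is an assumption introduced here as a test case, not a result derived in this chapter.

```python
import math

def acceleration(R, rho, p, Lambda, G=1.0, c=1.0):
    """Right-hand side of the first equation in (14.36), i.e. R double-dot."""
    return -(4.0 * math.pi * G / 3.0) * (rho + 3.0 * p / c**2) * R + Lambda * c**2 * R / 3.0

def expansion_rate_squared(R, rho, Lambda, k, G=1.0, c=1.0):
    """Right-hand side of the second equation in (14.36), i.e. (R dot)**2."""
    return (8.0 * math.pi * G / 3.0) * rho * R**2 + Lambda * c**2 * R**2 / 3.0 - c**2 * k

def trial_solution(t, G=1.0):
    """Assumed test case: R = t**(2/3), rho = 1/(6 pi G t**2), pressureless."""
    R = t ** (2.0 / 3.0)
    R_dot = (2.0 / 3.0) * t ** (-1.0 / 3.0)
    R_ddot = -(2.0 / 9.0) * t ** (-4.0 / 3.0)
    rho = 1.0 / (6.0 * math.pi * G * t**2)
    return R, R_dot, R_ddot, rho

R, R_dot, R_ddot, rho = trial_solution(2.5)
# Both residuals should vanish to rounding error if the trial solution is valid.
print(R_ddot - acceleration(R, rho, 0.0, 0.0))
print(R_dot**2 - expansion_rate_squared(R, rho, 0.0, 0))
```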


14.14 Equation of motion for the cosmological fluid 

For any particular model of the universe, the two cosmological field equations
(14.36) are sufficient to determine R(t). Nevertheless, we can derive one
further important equation (which is often useful in shortening calculations) from
the fact that energy-momentum conservation requires

$$\nabla_\mu T^{\mu\nu} = 0.$$

From our discussion of a perfect fluid in Chapter 8, we know that this requirement
leads to the relativistic equations of continuity and motion for the cosmological
fluid. These equations read

$$\nabla_\mu(\rho u^\mu) + \frac{p}{c^2}\nabla_\mu u^\mu = 0, \tag{14.37}$$

$$\left(\rho + \frac{p}{c^2}\right)u^\mu\nabla_\mu u^\nu = \left(g^{\mu\nu} - \frac{u^\mu u^\nu}{c^2}\right)\nabla_\mu p. \tag{14.38}$$

The second equation is easily shown to be satisfied identically, since both sides
equal zero. This confirms that the fluid particles (galaxies) follow geodesics,
which was to be expected since p is a function of t alone, and so there is no
pressure gradient to push them off geodesics. The continuity equation (14.37) can
be written

$$(\partial_\mu\rho)u^\mu + \left(\rho + \frac{p}{c^2}\right)\left(\partial_\mu u^\mu + \Gamma^\mu{}_{\nu\mu}u^\nu\right) = 0.$$
Remembering that $\rho$ is a function of t alone, and with $u^\mu = \delta^\mu_0$, this reduces to

$$\dot{\rho} + \left(\rho + \frac{p}{c^2}\right)\frac{3\dot{R}}{R} = 0, \tag{14.39}$$

which expresses energy conservation. This equation can in fact be derived directly
from the field equations (14.36) by eliminating $\ddot{R}$. Thus, only two of the three
equations (14.36) and (14.39) are independent. One may use whichever two
equations are most convenient in any particular calculation.

Equation (14.39) can be simply rearranged into the useful alternative form

$$\frac{d(\rho R^3)}{dt} = -\frac{3p\dot{R}R^2}{c^2}. \tag{14.40}$$

Moreover, by transforming the derivative with respect to t to a derivative with
respect to R, one obtains a third useful form of the equation, namely

$$\frac{d(\rho R^3)}{dR} = -\frac{3pR^2}{c^2}. \tag{14.41}$$


Finally, we note that the density and pressure of a fluid are related by its
equation of state. In cosmology, it is usual to assume that (each component of)
the cosmological fluid has an equation of state of the form

$$p = w\rho c^2,$$

where the equation-of-state parameter w is a constant (in the more exotic
cosmological models one sometimes allows w to be a function of cosmic time t, but we
shall not consider such models here). The energy equation (14.41) can then be
written

$$\frac{d(\rho R^3)}{dR} = -3w\rho R^2.$$

This equation has the immediate solution

$$\rho \propto R^{-3(1+w)}, \tag{14.42}$$

which gives the evolution of the density $\rho$ as a function of the scale factor R(t).
Note that in general $\rho c^2$ is the energy density of the fluid. In particular $w = 0$
for pressureless ‘dust’, $w = \tfrac{1}{3}$ for radiation and $w = -1$ for the vacuum (if the
cosmological constant $\Lambda \neq 0$; see Section 8.7).
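
The solution (14.42) can be confirmed by integrating the energy equation numerically. Expanding the derivative in (14.41) with $p = w\rho c^2$ gives $d\rho/dR = -3(1+w)\rho/R$. The sketch below integrates this with a classical fourth-order Runge-Kutta scheme; the integrator and the function name `evolve_density` are illustrative choices, not part of the text.

```python
def evolve_density(rho_start, R_start, R_end, w, steps=2000):
    """Integrate d(rho)/dR = -3 (1 + w) rho / R from R_start to R_end with RK4."""
    def slope(R, rho):
        return -3.0 * (1.0 + w) * rho / R

    h = (R_end - R_start) / steps
    R, rho = R_start, rho_start
    for _ in range(steps):
        k1 = slope(R, rho)
        k2 = slope(R + h / 2, rho + h * k1 / 2)
        k3 = slope(R + h / 2, rho + h * k2 / 2)
        k4 = slope(R + h, rho + h * k3)
        rho += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        R += h
    return rho

# Doubling the scale factor: compare with the power law rho ~ R**(-3 (1 + w)).
for name, w in (("dust", 0.0), ("radiation", 1.0 / 3.0), ("vacuum", -1.0)):
    print(name, evolve_density(1.0, 1.0, 2.0, w), 2.0 ** (-3.0 * (1.0 + w)))
```

Doubling R therefore dilutes dust by a factor 8 and radiation by a factor 16, while the vacuum density is unchanged.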
14.15 Multiple-component cosmological fluid

Suppose that the cosmological fluid in fact consists of several distinct components 
(for example, matter, radiation and the vacuum) that do not interact except through 
their mutual gravitation. Let us suppose further that each component can be 
modelled as a perfect fluid, as discussed above. 

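The energy-momentum tensor of a multiple-component fluid is given simply by

\[
  T^{\mu\nu} = \sum_i (T^{\mu\nu})_i,
\]

where $i$ labels the various fluid components. Since each component is modelled
as a perfect fluid, we have

\[
  T^{\mu\nu} = \sum_i \left[\left(\rho_i + \frac{p_i}{c^2}\right)u^\mu u^\nu - p_i g^{\mu\nu}\right]
             = \left[\sum_i \left(\rho_i + \frac{p_i}{c^2}\right)\right]u^\mu u^\nu - \left(\sum_i p_i\right)g^{\mu\nu}.
\]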

Thus, the multicomponent fluid can itself be modelled as a single perfect fluid with 


\[
  \rho = \sum_i \rho_i \quad\text{and}\quad p = \sum_i p_i, \tag{14.43}
\]

which can be substituted directly into our cosmological field equations (14.36).$^4$

Moreover, since we are assuming that the fluid components are non-interacting,
conservation of energy and momentum requires that the condition

\[
  \nabla_\mu (T^{\mu\nu})_i = 0
\]

holds separately for each component. Then each fluid will obey an energy equation
of the form (14.39). Thus, if $w_i = p_i/(\rho_i c^2)$ then the density of each fluid evolves
independently of the other components as

\[
  \rho_i \propto R^{-3(1+w_i)}. \tag{14.44}
\]
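
To see how (14.43) and (14.44) combine, the sketch below evaluates the ratio $p/(\rho c^2)$ of the summed fluid for a mixture of components with constant $w_i$. It is a minimal illustration: the function name and the sample densities (arbitrary units, quoted at $R = 1$) are assumptions chosen here, not values from the text.

```python
def effective_w(R, components):
    """Return p / (rho c^2) for the combined fluid at scale factor R.

    components: list of (rho_i at R = 1, w_i) pairs; each density is
    evolved with (14.44) and the sums follow (14.43).
    """
    rho = 0.0
    p_over_c2 = 0.0
    for rho0, w in components:
        rho_i = rho0 * R ** (-3.0 * (1.0 + w))  # equation (14.44)
        rho += rho_i
        p_over_c2 += w * rho_i                  # p_i / c^2 = w_i * rho_i
    return p_over_c2 / rho


if __name__ == "__main__":
    # Illustrative mixture of dust, radiation and vacuum (assumed values).
    mix = [(0.3, 0.0), (5e-5, 1.0 / 3.0), (0.7, -1.0)]
    for R in (1e-6, 1e-2, 1.0, 10.0):
        print(R, effective_w(R, mix))
```

For such a mixture the ratio drifts from close to $\tfrac{1}{3}$ at small $R$ towards $-1$ at large $R$, so the combined fluid does not have a single constant equation-of-state parameter even though each component does.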


Exercises 

14.1 In an $N$-dimensional manifold, consider the tensor

\[
  R_{ijkl} = K\,(g_{ik}g_{jl} - g_{il}g_{jk}),
\]

where $K$ may be a function of position. Show that this tensor satisfies the symmetry
properties and the cyclic identity of the curvature tensor. Show that, in order to
satisfy the Bianchi identity, one requires $K$ to be constant if $N > 2$.


4 Unfortunately, if the individual equation-of-state parameters $w_i$ are constants one cannot, in general, define
a single effective equation-of-state parameter $w = p/(\rho c^2)$ that is also independent of cosmic time $t$.
14.2 For a 3-space with a line element of the form

\[
  d\sigma^2 = B(r)\,dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2,
\]

show that the non-zero components of the Ricci tensor are

\[
  R_{rr} = -\frac{1}{rB}\frac{dB}{dr}, \qquad
  R_{\theta\theta} = \frac{1}{B} - 1 - \frac{r}{2B^2}\frac{dB}{dr}, \qquad
  R_{\phi\phi} = R_{\theta\theta}\sin^2\theta.
\]

Hence show that if the 3-space is maximally symmetric then $B(r)$ must take the
form

\[
  B(r) = \frac{1}{A - Kr^2},
\]

where $A$ and $K$ are constants.

14.3 In a four-dimensional Euclidean space with ‘Cartesian’ coordinates $(w, x, y, z)$, a
3-sphere of radius $R$ is defined by $w^2 + x^2 + y^2 + z^2 = R^2$. Show that the metric
on the surface of the 3-sphere can be written in the form

\[
  d\sigma^2 = R^2\left[d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\right].
\]

Show that the total volume of the 3-sphere is $V = 2\pi^2R^3$.

14.4 In a four-dimensional Minkowski space with ‘Cartesian’ coordinates $(w, x, y, z)$,
a 3-hyperboloid is defined by $w^2 - x^2 - y^2 - z^2 = R^2$. Show that the metric on
the surface of the 3-hyperboloid can be written in the form

\[
  d\sigma^2 = R^2\left[d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\right].
\]

Show that the total volume of the 3-hyperboloid is infinite.

14.5 At cosmic time $t_1$, a massive particle is shot out into an expanding FRW universe
with velocity $v_1$ relative to comoving cosmological observers. At a later cosmic
time $t_2$ the particle has a velocity $v_2$ with respect to comoving cosmological
observers. Show that, at any intermediate cosmic time $t$, the velocity of the particle
as measured by a comoving cosmological observer is

\[
  v(t) = R(t)\frac{d\chi}{dt}.
\]

Hence show that

\[
  \frac{\gamma_{v_2}v_2}{\gamma_{v_1}v_1} = \frac{R(t_1)}{R(t_2)},
\]

where $\gamma_v = (1 - v^2/c^2)^{-1/2}$ and $R(t)$ is the scale factor at cosmic time $t$. By
considering the particle momentum, show that as $v_1 \to c$ the photon redshift
formula is recovered.

14.6 In the limit $z \ll 1$, show that the look-back time for a galaxy with redshift $z$ is

\[
  t_0 - t = H_0^{-1}z - H_0^{-1}\left(1 + \tfrac{1}{2}q_0\right)z^2 + \cdots.
\]

Show also that, in this limit, the variation of the Hubble parameter with redshift
is given by

\[
  H(z) = H_0\left[1 + (1 + q_0)z - \cdots\right].
\]


14.7 In a spatially flat FRW geometry, show that the luminosity and angular diameter
distances to an object of redshift $z$ are given, in the limit $z \ll 1$, by

\[
  d_L = \frac{c}{H_0}\left[z + \tfrac{1}{2}(1 - q_0)z^2 + \cdots\right],
\]

\[
  d_A = \frac{c}{H_0}\left[z - \tfrac{1}{2}(3 + q_0)z^2 + \cdots\right].
\]

Hence show that the angular diameter of a standard object can increase as $z$
increases. Do these results still hold in a spatially curved FRW geometry?

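14.8 In the FRW geometry, show that the look-back time to a nearby object at proper
distance $d$ is

\[
  t_0 - t = \frac{d}{c} - \frac{H_0d^2}{2c^2} + \cdots.
\]

Hence show that the redshift to the object is

\[
  z = \frac{H_0d}{c} + \frac{1 + q_0}{2}\,\frac{H_0^2d^2}{c^2} + \cdots.
\]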

14.9 The observed flux in the frequency range $[\nu_1, \nu_2]$ received from some distant
comoving object is given by

\[
  F_{\rm obs}(\nu_1, \nu_2) = \int_{\nu_1}^{\nu_2} f_{\rm obs}(\nu)\,d\nu,
\]

where $f_{\rm obs}(\nu)$ is the observed flux density (in W m$^{-2}$ Hz$^{-1}$) as a function of
frequency. If $f_{\rm em}(\nu)$ is the emitted (or intrinsic) flux density of the object, show
that

\[
  f_{\rm obs}(\nu) = \frac{f_{\rm em}((1+z)\nu)}{1+z},
\]

where $z$ is the redshift of the object. If $f_{\rm em}(\nu) \propto \nu^\alpha$ over a wide range of frequencies,
show that

\[
  F_{\rm obs}(\nu_1, \nu_2) = K_z F_{\rm em}(\nu_1, \nu_2),
\]

where the K-correction is given by $K_z = (1+z)^{\alpha-1}$.

14.10 The observed surface brightness $\Sigma_{\rm obs}$ of an extended object observed in the
frequency range $[\nu_1, \nu_2]$ is defined as the observed flux per unit solid angle. Thus,
for a (small) circular object subtending an angular diameter $\Delta\theta$ we have

v ^obsOl’^) 

obs tt(AO) 2 ’ 
where $F_{\rm obs}(\nu_1, \nu_2)$ is defined in Exercise 14.9. Show that $\Sigma_{\rm obs}$ can be written as

y = K z 

ob! ’ 7 T 4tt1 2 (1 + z) 4 ’ 


where $l$ is the physical (projected) diameter of the object, $L_{\rm em}(\nu_1, \nu_2)$ is the
intrinsic luminosity of the object in the frequency range $[\nu_1, \nu_2]$ and $K_z$ is the
K-correction.

Note: The above result is independent of cosmological parameters. Moreover,
setting aside the K-correction, the $(1+z)^{-4}$-dependence means that the surface
brightness of extended objects drops very rapidly with redshift, making the
detection of high-$z$ objects difficult.

14.11 A commonly used distance measure in cosmology is the proper-motion distance
$d_M$ defined by

\[
  d_M = \frac{v}{\dot{\theta}},
\]

where $v$ is the proper transverse velocity of (some part of) the object, which is
assumed known from astrophysics, and $\dot{\theta}$ is the corresponding observed angular
velocity. Show that

\[
  d_M = (1+z)\,d_A = \frac{d_L}{1+z},
\]

where $d_A$ and $d_L$ are the angular-diameter distance and the luminosity distance to
the object respectively.

14.12 A certain population of galaxies undergoes a short ultra-luminous phase at redshift
$z = z_*$ that lasts for a proper time interval $\Delta t$. After this phase, such galaxies are
neither created nor destroyed. If $z_* \ll 1$, show that for a spatially flat universe the
total number of such galaxies in the sky that are in this phase is given by


477c 3 n 0 Ar 


[z*+ §(1 — < 7o)z*H-] ■ 


where $n_0$ is the present-day proper number density of these galaxies.

14.13 In the comoving coordinates $[x^\mu] = (t, r, \theta, \phi)$, the FRW metric takes the form

\[
  ds^2 = c^2dt^2 - R^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right].
\]

Using the ‘Lagrangian’ method, or otherwise, calculate the corresponding
connection coefficients $\Gamma^\sigma{}_{\mu\nu}$. Hence calculate the non-zero elements of the Ricci
tensor $R_{\mu\nu}$.

14.14 In Newtonian cosmology, the universe is modelled as an infinite gas of density $\rho(t)$
that is expanding in such a way that the relative recessional velocity of any two
gas particles is $v(t) = H(t)R(t)$, where $H(t) = \dot{R}(t)/R(t)$ and $R(t)$ is the separation
of the particles at time $t$. Use Gauss’ law to determine the force on a particle of
mass $m$ on the edge of an arbitrary spherical region, and hence show that

\[
  \ddot{R} = -\frac{4\pi G}{3}\rho R.
\]

By considering the total energy $E$ of the above particle, show further that

\[
  \dot{R}^2 = \frac{8\pi G}{3}\rho R^2 - c^2k,
\]

where the constant $k = -2E/(mc^2)$. Compare these Newtonian cosmological field
equations with their general-relativistic counterparts.

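14.15 Show that the relativistic equation of motion for the cosmological fluid is satisfied
identically and that the relativistic equation of continuity takes the form

\[
  \dot{\rho} + \left(\rho + \frac{p}{c^2}\right)\frac{3\dot{R}}{R} = 0.
\]

Show further that this equation may also be written in the forms

\[
  \frac{d(\rho R^3)}{dt} = -\frac{3p\dot{R}R^2}{c^2}
  \quad\text{and}\quad
  \frac{d(\rho R^3)}{dR} = -\frac{3pR^2}{c^2}.
\]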

14.16 Use the cosmological field equations directly to derive the relativistic equation of 
continuity for the cosmological fluid given in Exercise 14.15. 

14.17 Consider a spherical comoving volume of the cosmological fluid whose surface is
defined by $\chi = \text{constant}$. As the universe expands show that, for the infinitesimal
time interval $t \to t + dt$, the conservation of energy requires that

\[
  c^2\rho V = c^2(\rho + d\rho)(V + dV) + p\,dV,
\]

where $\rho c^2$ and $p$ are the energy density and pressure of the fluid respectively.
Hence show that

\[
  \frac{d\rho}{dR} = -3(1 + w)\frac{\rho}{R},
\]

where $w = p/(\rho c^2)$ and $R(t)$ is the scale factor of the universe. Show that this
equation has the solution $\rho \propto R^{-3(1+w)}$.
In the previous chapter, we considered the geometric and kinematic properties of
the Friedmann-Robertson-Walker (FRW) metric and derived the cosmological
field equations for the scale factor $R(t)$. In this chapter, we will use the
cosmological field equations to determine the behaviour of the scale factor as a function
of cosmic time in various cosmological models.


15.1 Components of the cosmological fluid 

In a general cosmological model, the universe is assumed to contain both matter
and radiation. In addition, the cosmological constant $\Lambda$ is generally assumed to be
non-zero. As discussed in Section 8.7, the modern interpretation of $\Lambda$ is in terms
of the energy density of the vacuum, which may also be modelled as a perfect
fluid (with a peculiar equation of state). Thus, one usually adopts the viewpoint
that the cosmological fluid consists of three components, namely matter, radiation
and the vacuum, each with a different equation of state. The total equivalent mass
density is simply the sum of the individual contributions,

\[
  \rho(t) = \rho_m(t) + \rho_r(t) + \rho_\Lambda(t), \tag{15.1}
\]

where $t$ is the cosmic time and we have adopted the commonly used cosmological
notation for the equivalent mass densities of matter, radiation and the vacuum
respectively. Moreover, we shall assume that these three components are
non-interacting (see Section 14.13); although matter and radiation did interact in the
early universe, this is a reasonable approximation for most of its history.

As mentioned in Section 14.12, each component of the cosmological fluid is
modelled as a perfect fluid with an equation of state of the form

\[
  p_i = w_i\rho_ic^2,
\]

where the equation-of-state parameter $w_i$ is a constant (and $i$ labels the
component). In particular, $w_i = 0$ for pressureless ‘dust’, $w_i = \tfrac{1}{3}$ for radiation and $w_i = -1$
for the vacuum. In general, if $w_i$ is a constant then, requiring that the weak energy
condition is satisfied (see Exercise 8.8) and that the local sound speed $(dp/d\rho)^{1/2}$
is less than $c$, one finds that $w_i$ must lie in the range $-1 \leq w_i \leq 1$. We see that
this is indeed the case for dust, radiation and the vacuum. We now discuss each
of these components in turn and conclude with a description of their relative
contributions to the total density as the universe evolves.


Matter 

In general, matter in the universe may come in several different forms. In
addition to the normal baryonic matter of everyday experience (such as protons and
neutrons), the universe may well contain more exotic forms of matter consisting
of fundamental particles that lie beyond the ‘Standard Model’ of particle physics.
Indeed, observations of the large-scale structure in the universe suggest that most
of the matter is in the form of non-baryonic dark matter, which interacts
electromagnetically only very weakly (and is hence invisible or ‘dark’). Moreover, dark
matter may itself come in different forms, such as cold dark matter (CDM) and
hot dark matter (HDM), the naming of which is connected to whether the typical
energy of the particles is non-relativistic or relativistic. We shall not pursue this
very interesting subject any further here$^1$ but merely note that the total matter
density (at any particular cosmic time $t$) may be expressed as the sum of the
baryonic and dark matter contributions,

\[
  \rho_m(t) = \rho_b(t) + \rho_{dm}(t).
\]


In the following discussion, we will not differentiate between different types of 
matter, since it is only the total matter density that determines how the scale factor 
R(t) evolves with cosmic time t. We shall also make the common assumption that 
the matter particles (in whatever form) have a thermal energy that is much less 
than their rest mass energy, and so the matter can be considered to be pressureless, 
i.e. dust. In this case the equation-of-state parameter is simply $w = 0$. Thus, from
(14.44), if the matter has a present-day proper density of $\rho_m(t_0) \equiv \rho_{m,0}$, its density
at some other cosmic time $t$ is given by

\[
  \rho_m(t) = \rho_{m,0}\left[\frac{R_0}{R(t)}\right]^3
  \quad\text{or}\quad
  \rho_m(z) = \rho_{m,0}(1+z)^3,
\]


where in the second expression we have used (14.17) to write the density in terms
of the redshift $z$. These expressions concur with our expectation for the behaviour
of the space density of dust particles in an expanding universe.

1 For a full discussion, see (for example) T. Padmanabhan, Structure Formation in the Universe, Cambridge
University Press, 1993.

Radiation 

The term radiation naturally includes photons, but also other species with very 
small or zero rest masses, such that they move relativistically today. An example 
of the latter is neutrinos (which may in fact have a small non-zero rest mass). The 
total equivalent mass density of radiation in the universe at some cosmic time t 
may then be written as the sum of the photon and neutrino contributions: 

\[
  \rho_r(t) = \rho_\gamma(t) + \rho_\nu(t).
\]

Once again, we will not differentiate between different types of radiation in our 
subsequent discussion, since it is only the total energy density that determines 
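the behaviour of the scale factor. For radiation, in general, we have $w = \tfrac{1}{3}$. Thus,
from (14.44), if the total radiation in the universe has a present-day energy density
of $\rho_{r,0}c^2$ then, at other cosmic times,

\[
  \rho_r(t) = \rho_{r,0}\left[\frac{R_0}{R(t)}\right]^4
  \quad\text{or}\quad
  \rho_r(z) = \rho_{r,0}(1+z)^4.
\]
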
In this case, the variation in the space density of photons (for example) again goes 
as (1 + z) 3 , but there is an additional factor 1 + z resulting from the cosmological 
redshift of each photon. 

It is worth noting that, to a very good approximation, the dominant contribution 
to the radiation energy density of the universe is due to the photons of the 
cosmic microwave background (CMB). This radiation is (to a very high degree 
of accuracy) uniformly distributed throughout the universe and has a blackbody 
form. For blackbody radiation, the number density of photons with frequencies 
in the range $[\nu, \nu + d\nu]$ is given by

\[
  n(\nu, T)\,d\nu = \frac{8\pi\nu^2}{c^3\left(e^{h\nu/kT} - 1\right)}\,d\nu, \tag{15.2}
\]

where $T$ is the ‘temperature’ of the radiation. Since the energy per unit frequency
is simply $u(\nu, T) = n(\nu, T)h\nu$, the total equivalent mass density of the radiation is

\[
  \rho_r(T) = \frac{1}{c^2}\int_0^\infty u(\nu, T)\,d\nu = \frac{aT^4}{c^2},
\]

where $a = 4\pi^2k_B^4/(60\hbar^3c^3)$ is the reduced Stefan-Boltzmann constant.
Observations show that the CMB is characterised by a present-day temperature






$T_0 = 2.726$ K, which corresponds to a total present-day number density $n_{\gamma,0} \approx
4 \times 10^8\ \mathrm{m}^{-3}$. It is easily shown that the CMB photon energy distribution retains
its general blackbody form as the universe expands. Thus, at any given cosmic 
time t, the temperature of the CMB radiation in the universe is given by 


\[
  T(t) = T_0\frac{R_0}{R(t)}
  \quad\text{or}\quad
  T(z) = T_0(1+z), \tag{15.3}
\]


from which we see that the universe must have not only been denser in the past, 
but also ‘hotter’. 
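
As a rough numerical check of the CMB figures quoted above, the blackbody spectrum (15.2) can be integrated in closed form, giving $n_\gamma = 16\pi\zeta(3)(kT/hc)^3$ for the photon number density and $\rho_r = aT^4/c^2$ for the equivalent mass density. The Python sketch below evaluates both; the physical constants are standard values typed in by hand, and the function names are illustrative choices rather than anything defined in the text.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8            # speed of light, m/s
HBAR = H_PLANCK / (2.0 * math.pi)


def zeta3(terms=100000):
    """Riemann zeta(3) from a truncated series (ample accuracy for this check)."""
    return sum(1.0 / n ** 3 for n in range(1, terms + 1))


def photon_number_density(T):
    """Integral of (15.2) over all frequencies, in photons per cubic metre."""
    return 16.0 * math.pi * zeta3() * (K_B * T / (H_PLANCK * C)) ** 3


def radiation_mass_density(T):
    """Equivalent mass density a T^4 / c^2 in kg per cubic metre."""
    a = 4.0 * math.pi ** 2 * K_B ** 4 / (60.0 * HBAR ** 3 * C ** 3)
    return a * T ** 4 / C ** 2


def cmb_temperature(z, T0=2.726):
    """Equation (15.3): T(z) = T0 (1 + z)."""
    return T0 * (1.0 + z)


if __name__ == "__main__":
    print(photon_number_density(2.726))
    print(radiation_mass_density(2.726))
    print(cmb_temperature(1100.0))
```

For $T_0 = 2.726$ K the number density comes out at roughly $4 \times 10^8\ \mathrm{m}^{-3}$, in line with the value quoted in the text.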


Vacuum 


As mentioned above, the vacuum can be modelled as a perfect fluid having
an equation of state $p = -\rho c^2$, so that the fluid has a negative pressure. This
corresponds to an equation-of-state parameter $w = -1$. Thus, from (14.44), we
see that at any cosmic time $t$ we have

\[
  \rho_\Lambda = \rho_{\Lambda,0} = \frac{\Lambda c^2}{8\pi G}.
\]

Thus, the energy density of the vacuum always has the same constant value. 


Relative contributions of the components 
On combining the above results, we find that the variation in the total equivalent 
mass density (15.1) may be written as 

\[
  \rho(t) = \rho_{m,0}\left[\frac{R_0}{R(t)}\right]^3 + \rho_{r,0}\left[\frac{R_0}{R(t)}\right]^4 + \rho_{\Lambda,0}. \tag{15.4}
\]


From this expression, we see that the relative contributions of matter, radiation
and the vacuum to the total density vary as the universe evolves. The details
clearly depend on the relative values of $\rho_{m,0}$, $\rho_{r,0}$ and $\rho_{\Lambda,0}$. Typically, however,
one would expect radiation to dominate the total density when $R(t)$ is small. As
the universe expands, the radiation energy density dies away the most quickly
and matter becomes the dominant component. Finally, if the universe continues
to expand then the matter density also dies away and the vacuum ultimately
dominates the energy density. We conclude by noting that cosmologists often
define the normalised scale parameter

\[
  a(t) \equiv \frac{R(t)}{R_0},
\]

in terms of which the above results are more compactly written, since $a_0 = 1$ by
definition. We shall make use of this parameter further in subsequent sections.
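
The sequence of radiation, matter and vacuum domination described above follows directly from the three terms of (15.4) written in terms of $a = R/R_0$. The sketch below is a hedged illustration: the function names are invented here, and the sample densities (0.3, $5 \times 10^{-5}$ and 0.7 in units of the present-day total) are assumed round numbers used only to exercise the formulae.

```python
def component_densities(a, rho_m0, rho_r0, rho_l0):
    """The three terms of (15.4) as functions of a = R / R0."""
    return {
        "matter": rho_m0 * a ** -3,
        "radiation": rho_r0 * a ** -4,
        "vacuum": rho_l0,
    }


def dominant_component(a, rho_m0, rho_r0, rho_l0):
    """Name of the component with the largest density at scale factor a."""
    parts = component_densities(a, rho_m0, rho_r0, rho_l0)
    return max(parts, key=parts.get)


def equality_scale_factors(rho_m0, rho_r0, rho_l0):
    """Scale factors at which radiation = matter and matter = vacuum."""
    a_rad_matter = rho_r0 / rho_m0
    a_matter_vacuum = (rho_m0 / rho_l0) ** (1.0 / 3.0)
    return a_rad_matter, a_matter_vacuum


if __name__ == "__main__":
    params = (0.3, 5e-5, 0.7)  # assumed illustrative values
    print(equality_scale_factors(*params))
    for a in (1e-6, 0.1, 1.0):
        print(a, dominant_component(a, *params))
```

With these assumed values, radiation and matter have equal densities at $a$ of order $10^{-4}$, and matter and vacuum at $a \approx 0.75$.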
15.2 Cosmological parameters


In our very simplified model of the universe discussed above, its entire history
is determined by only a handful of cosmological parameters. In particular, if one
specifies the values of the equivalent mass densities $\rho_m(t_*)$, $\rho_r(t_*)$ and $\rho_\Lambda$ at
some particular cosmic time $t_*$ then the value of each density, and hence the total
density, is determined at all other cosmic times $t$. Thus, specifying these quantities
is sufficient to determine the scale factor $R(t)$ at all cosmic times. It is most
natural to take $t_*$ to be the present-day cosmic time $t_0$, and so the cosmological
model is entirely fixed by specifying the three quantities

\[
  \rho_{m,0}, \quad \rho_{r,0}, \quad \rho_{\Lambda,0}.
\]


It is, however, both convenient and common practice in cosmology to work
instead in terms of alternative dimensionless quantities, usually called density
parameters or simply densities, which are defined by

\[
  \Omega_i(t) \equiv \frac{8\pi G}{3H^2(t)}\,\rho_i(t), \tag{15.5}
\]

where $H(t)$ is the Hubble parameter and the label $i$ denotes ‘m’, ‘r’ or ‘$\Lambda$’. It
is worth noting that $\Omega_\Lambda(t)$ is, in general, a function of cosmic time $t$ (unlike
$\rho_\Lambda$, which is a constant). In terms of these new dimensionless parameters, the
cosmological model may thus be fixed by specifying the values of the four
present-day quantities

\[
  H_0, \quad \Omega_{m,0}, \quad \Omega_{r,0}, \quad \Omega_{\Lambda,0}. \tag{15.6}
\]

A major goal of observational cosmology is therefore to determine these quantities
for our universe. Significant advances in the last decade mean that cosmologists
now know these values to an accuracy of just a few per cent.$^2$ We simply note
here that

\[
  H_0 \approx 70\ \mathrm{km\,s^{-1}\,Mpc^{-1}}, \quad
  \Omega_{m,0} \approx 0.3, \quad
  \Omega_{r,0} \approx 5 \times 10^{-5}, \quad
  \Omega_{\Lambda,0} \approx 0.7, \tag{15.7}
\]

where the units of $H_0$ are those most commonly used in cosmology, in which
1 Mpc = $10^6$ parsecs $\approx 3.09 \times 10^{22}$ m; in SI units, $H_0 \approx 2.27 \times 10^{-18}\ \mathrm{s}^{-1}$. Perhaps
most astonishing is that the present-day energy density of the universe is
dominated by the vacuum!


2 How these observational advances have been achieved is discussed in, for example, J. Peacock, Cosmological
Physics, Cambridge University Press, 1999 or P. Coles & F. Lucchin, Cosmology: The Origin and Evolution
of Cosmic Structure (2nd edition), Wiley, 2002.
We also note, for completeness, that cosmologists define further analogous
dimensionless density parameters for the individual contributions to the matter
and the radiation. For example, $\Omega_b$, $\Omega_{dm}$ and $\Omega_\nu$ are commonly used to denote
the dimensionless density of baryons, dark matter and neutrinos respectively. For
our universe, cosmological observations suggest the present-day values

\[
  \Omega_{b,0} \approx 0.05, \quad \Omega_{dm,0} \approx 0.25, \quad \Omega_{\nu,0} \approx 0, \tag{15.8}
\]

noting, in particular, that only around one-sixth of the matter density is in the form
of the familiar baryonic matter. Moreover, the majority of the baryonic matter
seems not to reside in ordinary (hydrogen-burning) stars; the contribution of such
stars is only $\Omega_* \approx 0.008$. The values of the individual quantities (15.8) affect the
astrophysical processes occurring in the universe and have a profound influence on,
for example, the formation of structure. For determining the overall expansion
history of the universe, however, only the quantities (15.6) need be specified.

The reason for defining the densities (15.5) becomes clear when we rewrite
the second of the cosmological field equations (14.36) in terms of them. Dividing
this equation through by $\dot{R}^2$ and noting that $H = \dot{R}/R$, we obtain

\[
  1 = \Omega_m + \Omega_r + \Omega_\Lambda - \frac{c^2k}{H^2R^2}, \tag{15.9}
\]

where, for notational simplicity, we have dropped the explicit time dependence of
the variables. Indeed, it is also common practice to define the curvature density
parameter

\[
  \Omega_k(t) \equiv -\frac{c^2k}{H^2(t)R^2(t)}, \tag{15.10}
\]

so that, at all cosmic times $t$, we have the elegant relation

\[
  \Omega_m + \Omega_r + \Omega_\Lambda + \Omega_k = 1. \tag{15.11}
\]

It should be noted that, in cosmological models with positive spatial curvature
($k = 1$), the parameter $\Omega_k$ is negative. Moreover, if the cosmological constant
$\Lambda$ is negative then so too is the vacuum density parameter $\Omega_\Lambda$. This behaviour
should be contrasted with that of $\Omega_m$ and $\Omega_r$, which are always positive.

From (15.9), we see that the values of $\Omega_m$, $\Omega_r$ and $\Omega_\Lambda$ determine the spatial
curvature of the universe in a simple fashion. We have three cases:

$\Omega_m + \Omega_r + \Omega_\Lambda < 1 \;\Leftrightarrow\;$ negative spatial curvature ($k = -1$) $\;\Leftrightarrow\;$ ‘open’,
$\Omega_m + \Omega_r + \Omega_\Lambda = 1 \;\Leftrightarrow\;$ zero spatial curvature ($k = 0$) $\;\Leftrightarrow\;$ ‘flat’,
$\Omega_m + \Omega_r + \Omega_\Lambda > 1 \;\Leftrightarrow\;$ positive spatial curvature ($k = 1$) $\;\Leftrightarrow\;$ ‘closed’.
The above relations are valid at any cosmic time $t$ but are most often applied to
the present day, $t = t_0$. In particular, it is also clear from (15.9) that, although
the density parameters $\Omega_m$, $\Omega_r$ and $\Omega_\Lambda$ are all, in general, functions of cosmic
time $t$, the quantity $\Omega_m + \Omega_r + \Omega_\Lambda - 1$ cannot change sign, since its sign is that
of the constant $k$. Thus, the universe cannot evolve from one
form of the FRW geometry to another. We note that cosmologists often add to
the plethora of density parameters by also defining the total density parameter

\[
  \Omega \equiv \Omega_m + \Omega_r + \Omega_\Lambda = 1 - \Omega_k, \tag{15.12}
\]

which is related to the total equivalent mass density (15.1) by $\Omega = 8\pi G\rho/(3H^2)$.
From (15.7), we see that for our universe $\Omega_0 \approx 1$ or equivalently $\Omega_{k,0} \approx 0$, and
it is therefore close to being spatially flat ($k = 0$).
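
The relation between the density parameters and the spatial geometry can be written as a small helper. The sketch below is illustrative only: the function names and, in particular, the tolerance used to decide when a universe counts as ‘flat’ are assumptions made here (observational values are never exactly on the dividing line).

```python
def curvature_density(omega_m, omega_r, omega_lambda):
    """Omega_k from (15.11): Omega_k = 1 - (Omega_m + Omega_r + Omega_Lambda)."""
    return 1.0 - (omega_m + omega_r + omega_lambda)


def spatial_geometry(omega_m, omega_r, omega_lambda, tol=1e-3):
    """Classify the FRW geometry; tol is an arbitrary illustrative tolerance."""
    omega_k = curvature_density(omega_m, omega_r, omega_lambda)
    if abs(omega_k) <= tol:
        return "flat"        # k = 0
    if omega_k > 0.0:
        return "open"        # k = -1, total density parameter below unity
    return "closed"          # k = +1, total density parameter above unity


if __name__ == "__main__":
    print(spatial_geometry(0.3, 5e-5, 0.7))
    print(spatial_geometry(0.3, 0.0, 0.0))
    print(spatial_geometry(2.0, 0.0, 0.0))
```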

Finally, it is worth noting that, for any cosmological model to be spatially
flat, one requires $\Omega = 1$ and it is common to describe the corresponding total
equivalent mass density as the critical density, which is given by

\[
  \rho_{\rm crit} \equiv \frac{3H^2}{8\pi G}.
\]

Hence, for any given value of the Hubble parameter, this expression gives the total
equivalent mass density required for the universe to be spatially flat. Since recent
cosmological observations suggest that our universe is indeed close to spatially
flat and, from (15.7), that $H_0 \approx 70\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$, one finds that the present-day total
equivalent mass density in our universe is

\[
  \rho_{\rm crit,0} \equiv \frac{3H_0^2}{8\pi G} \approx 9.2 \times 10^{-27}\ \mathrm{kg\,m^{-3}}.
\]

As mentioned above, it is thought that only around 30 per cent of this equivalent
mass density is in the form of matter and only around 5 per cent in the form of
baryonic matter. Nevertheless, it is worth noting that $\rho_{\rm crit,0} \approx 5.5$ protons $\mathrm{m}^{-3}$,
and so the critical density turns out to be extremely low by laboratory standards.$^3$
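
The numbers quoted for the Hubble parameter in SI units and for the critical density can be reproduced with a few lines of arithmetic. In the sketch below the physical constants are standard values entered by hand, and the function names are illustrative assumptions.

```python
import math

G = 6.67430e-11            # gravitational constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22       # one megaparsec in metres
PROTON_MASS = 1.67262e-27  # kg


def hubble_si(h0_km_s_mpc):
    """Convert a Hubble parameter from km/s/Mpc to 1/s."""
    return h0_km_s_mpc * 1.0e3 / MPC_IN_M


def critical_density(hubble):
    """rho_crit = 3 H^2 / (8 pi G), with H in 1/s and the result in kg/m^3."""
    return 3.0 * hubble ** 2 / (8.0 * math.pi * G)


if __name__ == "__main__":
    h0 = hubble_si(70.0)
    rho_crit = critical_density(h0)
    print(h0)                      # Hubble parameter in 1/s
    print(rho_crit)                # critical density in kg/m^3
    print(rho_crit / PROTON_MASS)  # equivalent number of protons per m^3
```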


15.3 The cosmological field equations 


Since the cosmological model can be fixed by specifying the values of the
quantities listed in (15.6), it is worthwhile rewriting the cosmological field equations
(14.36) in terms of these parameters. Let us begin with the second field equation.
Recalling that $H = \dot{R}/R$, this may be written

\[
  H^2 = \frac{8\pi G}{3}\sum_i \rho_i - \frac{c^2k}{R^2},
\]

3 The fact that this is a number of order unity is an accident of our choice of units!
where the label $i$ includes matter, radiation and the vacuum. From (15.4), (15.5)
and (15.10), we therefore find

\[
  H^2 = H_0^2\left(\Omega_{m,0}a^{-3} + \Omega_{r,0}a^{-4} + \Omega_{\Lambda,0} + \Omega_{k,0}a^{-2}\right), \tag{15.13}
\]

where we have written the result in terms of the dimensionless scale parameter
$a = R/R_0$. It should be remembered that $\Omega_{k,0} = 1 - \Omega_{m,0} - \Omega_{r,0} - \Omega_{\Lambda,0}$ and may
be considered merely as a convenient shorthand. It is also worth noting that, since
$a = R/R_0 = (1+z)^{-1}$, equation (15.13) immediately yields an expression for the
Hubble parameter $H(z)$ as a function of redshift $z$.
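
As an illustration of how (15.13) gives $H(z)$, the sketch below substitutes $a = (1+z)^{-1}$ and evaluates the right-hand side. The function name is an assumption, and the default parameter values are simply the round numbers of (15.7), used here for convenience.

```python
import math


def hubble_parameter(z, h0=70.0, omega_m0=0.3, omega_r0=5e-5, omega_lambda0=0.7):
    """H(z) from (15.13), in the same units as h0 (here km/s/Mpc).

    Omega_k0 is not an independent input: it is fixed by
    Omega_k0 = 1 - Omega_m0 - Omega_r0 - Omega_Lambda0.
    """
    omega_k0 = 1.0 - omega_m0 - omega_r0 - omega_lambda0
    a = 1.0 / (1.0 + z)
    e_squared = (omega_m0 * a ** -3 + omega_r0 * a ** -4
                 + omega_lambda0 + omega_k0 * a ** -2)
    return h0 * math.sqrt(e_squared)


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 3.0):
        print(z, hubble_parameter(z))
```

By construction $H(0) = H_0$, since the four density parameters sum to unity at the present epoch.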

We now turn to the first cosmological field equation in (14.36). Multiplying
this equation through by $R/\dot{R}^2$ and again noting that $H = \dot{R}/R$, we have

\[
  \frac{\ddot{R}R}{\dot{R}^2} = -\frac{4\pi G}{3H^2}\sum_i \rho_i(1 + 3w_i),
\]

where the label $i$ once more includes matter, radiation and the vacuum. The
left-hand side is equal to minus the deceleration parameter $q$ defined in (14.19). Thus,
substituting the appropriate value of $w_i$ for each component and using (15.5), one
finds the neat relation

\[
  q = \tfrac{1}{2}\left(\Omega_m + 2\Omega_r - 2\Omega_\Lambda\right). \tag{15.14}
\]

If desired, one can easily write this equation explicitly in terms of the present-day
values of the density parameters by using the result (15.13) and the relation

\[
  \Omega_i = \Omega_{i,0}\left(\frac{H_0}{H}\right)^2 a^{-3(1+w_i)},
\]

which holds generally for matter, radiation and the vacuum.
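
The deceleration parameter at any epoch follows from combining (15.14) with the scaling $\Omega_i = \Omega_{i,0}(H_0/H)^2a^{-3(1+w_i)}$ and the Hubble parameter (15.13). The sketch below does this numerically; the function names are illustrative assumptions and the sample parameter values are the round numbers of (15.7).

```python
def deceleration_parameter(omega_m, omega_r, omega_lambda):
    """Equation (15.14): q = (Omega_m + 2 Omega_r - 2 Omega_Lambda) / 2."""
    return 0.5 * (omega_m + 2.0 * omega_r - 2.0 * omega_lambda)


def density_parameters(a, omega_m0, omega_r0, omega_lambda0):
    """Omega_m, Omega_r, Omega_Lambda at scale factor a.

    Uses (H / H0)^2 from (15.13) together with w = 0, 1/3 and -1.
    """
    omega_k0 = 1.0 - omega_m0 - omega_r0 - omega_lambda0
    e_squared = (omega_m0 * a ** -3 + omega_r0 * a ** -4
                 + omega_lambda0 + omega_k0 * a ** -2)
    return (omega_m0 * a ** -3 / e_squared,
            omega_r0 * a ** -4 / e_squared,
            omega_lambda0 / e_squared)


if __name__ == "__main__":
    present = (0.3, 5e-5, 0.7)
    print(deceleration_parameter(*present))   # negative: accelerating today
    past = density_parameters(0.1, *present)
    print(past, deceleration_parameter(*past))
```

With these assumed values $q_0$ is negative (accelerating expansion today), whereas at $a = 0.1$ matter dominates and $q$ is positive.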


15.4 General dynamical behaviour of the universe 

The cosmological field equations (15.13) and (15.14) allow us to determine the
general dynamical behaviour and the spatial geometry of the universe for any
given set of values for the parameters $\Omega_{m,0}$, $\Omega_{r,0}$ and $\Omega_{\Lambda,0}$. The observations
(15.7) suggest that the present-day value of the radiation density $\Omega_{r,0}$ is
significantly smaller than the matter and vacuum densities. It is therefore a reasonable
approximation to neglect $\Omega_{r,0}$ and parameterise a universe like our own in terms of
just $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$ (and $H_0$, which is irrelevant for our discussion in this section).

Figure 15.1 presents a summary of the properties of FRW universes dominated
by matter and vacuum energy (known as Lemaître models) as a function of
position in the $(\Omega_{m,0}, \Omega_{\Lambda,0})$ parameter space. The dividing lines between the
various regions may be determined from the field equations (15.13, 15.14) and the
relation (15.11). In particular, the ‘open-closed’ line comes directly from (15.11)
evaluated at the present epoch, which gives the condition

\[
  \Omega_{\Lambda,0} = 1 - \Omega_{m,0}.
\]

Similarly, the ‘accelerating-decelerating’ line is obtained immediately by setting
$q_0 = 0$ in (15.14) for $t = t_0$, which gives

\[
  \Omega_{\Lambda,0} = \tfrac{1}{2}\Omega_{m,0}.
\]

The ‘expand-forever-recollapse’ line and the ‘big-bang-no-big-bang’ line require
a little more work, as we now discuss.

Figure 15.1 Properties of FRW universes dominated by matter and vacuum
energy, as a function of the present-day density parameters $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$. The
circle indicates the region of the parameter space that is consistent with recent
cosmological observations.

In fact, both these lines are determined from the expression (15.13) for the
Hubble parameter. In particular, the condition for the graph of $R(t)$, or equivalently
of $a(t)$, to have a turning point at some cosmic time $t = t_*$ is simply that $H(t_*) = 0$.
Setting $\Omega_{k,0} = 1 - \Omega_{m,0} - \Omega_{\Lambda,0}$ in (15.13), we find that, after rearranging, this
condition corresponds to

\[
  f(a) \equiv \Omega_{\Lambda,0}a^3 + (1 - \Omega_{m,0} - \Omega_{\Lambda,0})a + \Omega_{m,0} = 0. \tag{15.15}
\]

This is a cubic equation for the value(s) of the scale factor $a = a_*$ at which $a(t)$
has a turning point. We are not, in fact, interested in the particular value(s) $a = a_*$
that solve (15.15) but only in whether a (real) solution exists in the region $a > 0$
(which is the only physically meaningful regime).

For the case $\Omega_{\Lambda,0} < 0$ we may deduce immediately from (15.14) that the
universe must have started with a ‘big bang’, at which $a = 0$, and must eventually
recollapse in a ‘big crunch’ as $a \to 0$ once more. In (15.14), a negative value of
$\Omega_\Lambda$ means that the deceleration parameter $q$ is always positive. Thus $\ddot{a}$ is always
negative, and hence the $a(t)$ graph must be concave (curving downwards) for all values of $t$. Since at
the present epoch $\dot{a}(t_0) > 0$ (because we observe redshifts, not blueshifts), this
means that $a(t)$ must have equalled zero at some point in the past, which it is
usual to take as $t = 0$.$^4$ Similar reasoning may be used to deduce that the universe
must eventually recollapse, although a little more care is required in this case. As
the universe expands, the vacuum energy eventually dominates and so we need
only consider the $\Omega_\Lambda$-term on the right-hand side of (15.14), which will not tend
to zero as the scale factor increases. Thus, $\ddot{a}$ cannot tend to zero and so $a \to 0$ at
some finite cosmic time in the future.

In our further analysis, we now need only consider the case in which $\Omega_{\Lambda,0} \geq 0$
in (15.15), but this still requires some care. Let us first consider the case for
which $\Omega_{\Lambda,0} = 0$. Immediately, we see that equation (15.15) then has the single
solution $a_* = \Omega_{m,0}/(\Omega_{m,0} - 1)$, which is negative in the range $0 < \Omega_{m,0} < 1$,
indicating that there is no (physically meaningful) turning point. Therefore, over
this range, the ‘expand-forever-recollapse’ line is simply given by $\Omega_{\Lambda,0} = 0$. We
must now address the far more complicated case for which $\Omega_{\Lambda,0} > 0$. In this case
$f(a) \to \pm\infty$ as $a \to \pm\infty$. Moreover $f(0) = \Omega_{m,0}$, which is positive. Thus, for
$f(a)$ to have a positive root, it must have a turning point in the region $a > 0$.
On evaluating the derivatives $f'(a)$ and $f''(a)$ with respect to $a$, it is clear that,
in the limiting case of interest, $f(a)$ must have the general form illustrated in
Figure 15.2. Thus, we require $f(a_*) = f'(a_*) = 0$, which quickly yields

\[
  a_* = \left(\frac{\Omega_{m,0}}{2\Omega_{\Lambda,0}}\right)^{1/3}. \tag{15.16}
\]


4 In fact, this reasoning is still valid in the case $\Omega_{\Lambda,0} = 0$, provided that the universe contains even an
infinitesimal amount of matter (or radiation). Thus all cosmological models with $\Lambda \leq 0$ have a big-bang
origin at some finite cosmic time in the past.
On substituting this expression back into (15.15) one then obtains a separate cubic
equation for $\Omega_{\Lambda,0}$, given by

\[
  4(1 - \Omega_{m,0} - \Omega_{\Lambda,0})^3 + 27\,\Omega_{m,0}^2\,\Omega_{\Lambda,0} = 0. \tag{15.17}
\]

By introducing the variable $x = [\Omega_{\Lambda,0}/(4\Omega_{m,0})]^{1/3}$, this equation quickly
reduces to

\[
  x^3 - \frac{3x}{4} + \frac{\Omega_{m,0} - 1}{4\Omega_{m,0}} = 0,
\]

which is amenable to analysis using the standard formulae for finding the roots
of a cubic. In particular, rewriting the resulting roots in terms of $\Omega_{\Lambda,0}$, one finds
the following three cases:
Moreover, from (15.16), one easily finds that $a_* < 1$ for (15.18, 15.19), whereas
$a_* > 1$ for (15.20). Since the universe is expanding, $a_* < 1$ corresponds to a
turning point in the past (i.e. no big bang), whereas $a_* > 1$ corresponds to a
turning point in the future (i.e. recollapse).
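
The classification described above can also be carried out numerically, without the closed-form roots, by examining the cubic $f(a)$ of (15.15) directly. The sketch below is one possible way to do this under stated assumptions: radiation is neglected, $\Omega_{m,0} \geq 0$, and the universe is expanding today. For $\Omega_{\Lambda,0} > 0$ it uses the facts that $f(1) = 1 > 0$ and that the minimum of $f$ for $a > 0$ lies at $a_{\min} = [-\Omega_{k,0}/(3\Omega_{\Lambda,0})]^{1/2}$, where $f(a_{\min}) = \Omega_{m,0} - 2\Omega_{\Lambda,0}a_{\min}^3$. The function name and the returned labels are illustrative choices.

```python
import math


def classify_universe(omega_m0, omega_lambda0):
    """Classify a matter + vacuum FRW model from the cubic f(a) of (15.15).

    Assumes omega_m0 >= 0, negligible radiation and present-day expansion.
    """
    omega_k0 = 1.0 - omega_m0 - omega_lambda0
    if omega_lambda0 < 0.0:
        return "big bang, recollapse"
    if omega_lambda0 == 0.0:
        # f(a) is linear: a positive root exists only for omega_m0 > 1.
        if omega_m0 > 1.0:
            return "big bang, recollapse"
        return "big bang, expands forever"
    if omega_k0 >= 0.0:
        # f is increasing for a > 0 and f(0) >= 0: no turning point.
        return "big bang, expands forever"
    a_min = math.sqrt(-omega_k0 / (3.0 * omega_lambda0))
    f_min = omega_m0 - 2.0 * omega_lambda0 * a_min ** 3
    if f_min > 0.0:
        return "big bang, expands forever"
    # f has positive roots on either side of a_min; since f(1) = 1 > 0,
    # a = 1 lies either above the larger root or below the smaller one.
    if a_min < 1.0:
        return "no big bang"          # turning point in the past
    return "big bang, recollapse"     # turning point in the future


if __name__ == "__main__":
    for point in [(0.3, 0.7), (2.0, 0.0), (0.3, 2.0), (3.0, 0.05), (0.3, -0.1)]:
        print(point, classify_universe(*point))
```

In the limiting case $f(a_{\min}) = 0$ the minimum coincides with the double root (15.16), so this test reproduces the dividing lines of the analytic treatment.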

The resulting lines, plotted in Figure 15.1, show some interesting features. In
particular, we note that when $\Omega_{\Lambda,0} = 0$ there is a direct correspondence between
the geometry of the universe and its eventual fate. In this case, open universes
expand forever, whereas closed universes recollapse. This correspondence no
longer holds in the presence of a non-zero cosmological constant, in which case
any combination of spatial geometry and eventual fate is possible. It is also
worth noting that the region of the $(\Omega_{m,0}, \Omega_{\Lambda,0})$-plane consistent with recent
cosmological observations is centred on the spatially flat model (0.3, 0.7) and
excludes the possibility of a zero cosmological constant at high significance.
These observations also show the expansion of the universe to be accelerating. 
They also require the universe to have started at a big bang at some finite cosmic 
time in the past and to expand forever in the future. 


15.5 Evolution of the scale factor 

So far, we have considered only the limiting behaviour of the (normalised)
scale factor $a(t)$ for different values of the cosmological parameters; this was
summarised in Figure 15.1. We now discuss how to find the form of the $a(t)$-curve
at all cosmic times, for a given set of (present-day) cosmological parameter
values. This behaviour is entirely determined by the cosmological field equation
(15.13). Remembering that $H = \dot{a}/a$, this may be written as

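$$\left(\frac{da}{dt}\right)^2 = H_0^2\left(\Omega_{m,0}a^{-1} + \Omega_{r,0}a^{-2} + \Omega_{\Lambda,0}a^{2} + 1 - \Omega_{m,0} - \Omega_{r,0} - \Omega_{\Lambda,0}\right). \tag{15.21}$$

Instead of working directly in terms of the cosmic time $t$, it is more convenient
to introduce the new dimensionless variable

$$\hat{t} = H_0(t - t_0), \tag{15.22}$$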


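which measures cosmic time relative to the present epoch in units of the ‘Hubble
time’ $H_0^{-1}$. In terms of this new variable, (15.21) becomes

$$\left(\frac{da}{d\hat{t}}\right)^2 = \Omega_{m,0}a^{-1} + \Omega_{r,0}a^{-2} + \Omega_{\Lambda,0}a^{2} + 1 - \Omega_{m,0} - \Omega_{r,0} - \Omega_{\Lambda,0}. \tag{15.23}$$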


There exist some special cases, where $\Omega_{m,0}$, $\Omega_{r,0}$ and $\Omega_{\Lambda,0}$ take on particular
simple values, for which equation (15.23) can be solved analytically; we will
discuss some of these cosmological models in Section 15.6. In general, however, a
numerical solution is necessary. Starting at the point $\hat{t} = 0$ (the present epoch), for
which $a_0 = 1$, the normalised scale factor at time step $n + 1$ can be approximated
by the Taylor expansion

$$a_{n+1} \approx a_n + \left(\frac{da}{d\hat{t}}\right)_n \Delta\hat{t} + \frac{1}{2}\left(\frac{d^2a}{d\hat{t}^2}\right)_n (\Delta\hat{t})^2, \tag{15.24}$$


where $\Delta\hat{t}$ is the (small) step size in $\hat{t}$. The coefficient of $\Delta\hat{t}$ is given by (15.23),
and the coefficient of $(\Delta\hat{t})^2$ may be obtained by differentiating (15.23). The latter
is important since, without the $(\Delta\hat{t})^2$ term, equation (15.24) would not carry the
integration correctly through a value of $a$ for which $da/d\hat{t}$ is small or zero.
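
The stepping scheme of (15.24) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the names `accel` and `integrate_scale_factor` are hypothetical, and the way the first derivative is advanced (averaging the second derivative over the step) is an added assumption, chosen so that $da/d\hat{t}$ can pass through zero and change sign at a turning point. The second derivative used here follows from differentiating (15.23) with respect to $\hat{t}$, which gives $d^2a/d\hat{t}^2$ as half the $a$-derivative of the right-hand side.

```python
def accel(a, om, orad, ol):
    """d^2a/dt_hat^2: half the a-derivative of the right-hand side of (15.23)."""
    return 0.5 * (-om / a**2 - 2.0 * orad / a**3 + 2.0 * ol * a)


def integrate_scale_factor(om, orad, ol, t_end, n_steps=4000):
    """Step a(t_hat) from the present epoch (t_hat = 0, a = 1) to t_hat = t_end.

    Returns a list of (t_hat, a) pairs. Stops early if a would reach zero
    (big bang when stepping backwards, big crunch when stepping forwards).
    """
    dt = t_end / n_steps       # a negative t_end integrates into the past
    t, a, v = 0.0, 1.0, 1.0    # da/dt_hat = +1 today: the RHS of (15.23) equals 1 at a = 1
    history = [(t, a)]
    for _ in range(n_steps):
        acc = accel(a, om, orad, ol)
        a_new = a + v * dt + 0.5 * acc * dt * dt              # Taylor step (15.24)
        if a_new <= 0.0:
            break
        v += 0.5 * (acc + accel(a_new, om, orad, ol)) * dt    # assumed derivative update
        t, a = t + dt, a_new
        history.append((t, a))
    return history
```

For the case (1, 0) the result can be compared with the exact form $a = (1 + \tfrac{3}{2}\hat{t})^{2/3}$, and for the case (0, 1) with $a = e^{\hat{t}}$; both follow directly from (15.23).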

Figure 15.3 shows the variation in the normalised scale factor $a(\hat{t})$ as a function
of $\hat{t}$ for different values of $(\Omega_{m,0}, \Omega_{\Lambda,0})$ as indicated, assuming that $\Omega_{r,0}$ is
negligible, as it is for our universe. In the top panel $\Omega_{m,0} + \Omega_{\Lambda,0} = 1$ in each case,
so each universe has a flat spatial geometry ($k = 0$). The solid line corresponds
to the case (0.3, 0.7), which is preferred by recent cosmological observations. An
interesting cosmological ‘coincidence’ for this model is that the present epoch,
$\hat{t} = 0$, corresponds almost exactly to the point of inflection on the $a(\hat{t})$ curve.
A second such ‘coincidence’ is that the age of the universe in this model (i.e.
the time since the big bang) is very close to one Hubble time.5 The broken-and-dotted
line in the top panel of the figure corresponds to the case (0, 1), which is
known as the de Sitter model and will be discussed further in Section 15.6. For
the moment, we simply note that this model has no big-bang origin (although
$a \to 0$ as $\hat{t} \to -\infty$) and will expand forever. The broken line in the top panel
corresponds to the case (1, 0), which is known as the Einstein-de-Sitter model and
will also be discussed in Section 15.6. As we see from the figure, this model does
have a big-bang origin. It is also on the borderline between expanding forever
and recollapsing; it will in fact expand forever, but $\dot{a} \to 0$ as $\hat{t} \to \infty$.

In the bottom panel of Figure 15.3 we have $\Omega_{m,0} + \Omega_{\Lambda,0} \neq 1$ in each case,
and so each universe is spatially curved; in particular the case (0.3, 0) is open
and the cases (0.3, 2) and (4, 0) are closed. We see that the case (0.3, 2) has no
big-bang origin, and is, in fact, what is known as a bounce model, where the 
universe collapses from large values of the scale factor and ‘bounces’ at some 
finite minimum value of a, after which it re-expands forever. Conversely, the case 
(4, 0) corresponds to a cosmological model with a big-bang origin that expands 
to some finite maximum value of a before recollapsing to a big crunch. 

Before going on to discuss cosmological models that admit an analytic solution
for $a(t)$, it is worth discussing the general case in the limit $a \to 0$. Whether


5 Whether such coincidences have some deeper significance is the subject of current cosmological research. 
Figure 15.3 The variation in the normalised scale factor as a function of the
dimensionless variable $H_0(t - t_0)$ for different values of $(\Omega_{m,0}, \Omega_{\Lambda,0})$ as indicated,
assuming that $\Omega_{r,0}$ is negligible. Top panel: $\Omega_{m,0} + \Omega_{\Lambda,0} = 1$ in each
case, so the universes have a flat spatial geometry ($k = 0$). Bottom panel:
$\Omega_{m,0} + \Omega_{\Lambda,0} \neq 1$ in each case, so the universes are spatially curved; in particular,
the case (0.3, 0) is open and the cases (0.3, 2) and (4, 0) are closed.


considering the big bang or the big crunch, in this limit we can assume that the
energy density of the universe is dominated by one kind of source (which one
will depend on the particular cosmological model under consideration). In this
case, (15.23) can be written

$$\left(\frac{da}{d\hat{t}}\right)^2 = \Omega_{i,0}\,a^{-(1+3w_i)} + \Omega_{k,0}, \tag{15.25}$$

where the label $i$ denotes the dominant form of the energy density as $a \to 0$
and $w_i$ is the corresponding equation-of-state parameter. Moreover, if we restrict
our attention to the (realistic) case, in which $i$ denotes either dust ($w_i = 0$) or
radiation ($w_i = \tfrac{1}{3}$), then the first term on the right-hand side of (15.25) dominates
as $a \to 0$, so we can neglect the curvature density $\Omega_{k,0}$. In this case, (15.25) can
be immediately integrated to give

$$a(\hat{t}) = \left[\pm\tfrac{3}{2}(1 + w_i)\sqrt{\Omega_{i,0}}\,(\hat{t} - \hat{t}_*)\right]^{2/[3(1+w_i)]}, \tag{15.26}$$


where $\hat{t}_*$ is the value of $\hat{t}$ at which $a = 0$ and the plus and minus signs correspond
to the big bang and big crunch respectively. From (15.25), we also note that
$da/d\hat{t} \to \infty$ as $a \to 0$. Thus, we conclude that the $a(\hat{t})$-graph meets the $\hat{t}$-axis at
right angles.
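
As a quick worked illustration of (15.26), the two realistic cases can be written out explicitly; only the exponent $2/[3(1+w_i)]$ and the prefactor $\tfrac{3}{2}(1+w_i)$ obtained from integrating (15.25) are used here.

```latex
\begin{align*}
\text{dust } (w_i = 0):\quad
  & a(\hat{t}) = \left[\pm\tfrac{3}{2}\sqrt{\Omega_{m,0}}\,(\hat{t} - \hat{t}_*)\right]^{2/3},\\
\text{radiation } (w_i = \tfrac{1}{3}):\quad
  & a(\hat{t}) = \left[\pm 2\sqrt{\Omega_{r,0}}\,(\hat{t} - \hat{t}_*)\right]^{1/2}.
\end{align*}
```

In both cases the exponent is less than one, so the slope of the curve diverges as $\hat{t} \to \hat{t}_*$, in line with the statement that the graph meets the axis at right angles.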


15.6 Analytical cosmological models 

Although in the general case the evolution of the (normalised) scale factor $a(t)$
must be determined numerically, there exist a number of special cases, corresponding
to particular values of the cosmological parameters $\Omega_{m,0}$, $\Omega_{r,0}$ and
$\Omega_{\Lambda,0}$, for which equation (15.23) can be solved analytically. We now discuss
some of these analytical cosmological models, all of which have inherited special
names that are widely used in cosmology. In this section, we will work in terms
of the cosmic time $t$ directly, rather than the dimensionless variable $\hat{t}$ defined
in (15.22).


The Friedmann models 

Cosmological models with a zero cosmological constant (and, strictly, a non-zero
matter or radiation density) are known as the Friedmann models. As noted in
Section 15.4, all Friedmann models have a big-bang origin at a finite cosmic time
in the past. Moreover, it is possible to place a strict upper limit on the age of the
universe in such models. Since the $a(t)$-curve is everywhere convex, it is clear
from Figure 15.4 that it crosses the $t$-axis at a time that is closer to the present
time $t = t_0$ than the time at which the tangent to the point $(t_0, a_0)$ reaches the
$t$-axis (note that $a_0 = 1$). Clearly, the point where the tangent meets the $t$-axis is
the point at which $a(t)$ would have been zero for $\dot{a}$ = constant and $\ddot{a} = 0$. The
time elapsed from that point to the present epoch is simply $a(t_0)/\dot{a}(t_0) = H_0^{-1}$.
Thus, in Friedmann models, the age of the universe must be less than the Hubble
time:

$$t_0 < \frac{1}{H_0}.$$
Figure 15.4 Diagram to illustrate that, for all Friedmann models, the age of the
universe is less than the Hubble time $1/H_0$.
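
To put the bound $t_0 < H_0^{-1}$ into concrete units, the Hubble time can be converted to Gyr. The short sketch below is illustrative; the function name `hubble_time_gyr` and the two conversion constants (approximate values for the megaparsec and for $10^9$ Julian years) are assumptions introduced here rather than quantities taken from the text.

```python
KM_PER_MPC = 3.0857e19         # kilometres in one megaparsec (approximate)
SECONDS_PER_GYR = 3.15576e16   # seconds in 10^9 Julian years


def hubble_time_gyr(h0_km_s_mpc):
    """Hubble time 1/H0 in Gyr, for H0 given in km s^-1 Mpc^-1."""
    return KM_PER_MPC / h0_km_s_mpc / SECONDS_PER_GYR


for h0 in (50.0, 70.0, 90.0):
    print(h0, round(hubble_time_gyr(h0), 2))
```

For $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ this gives a Hubble time of roughly 14 Gyr, which is therefore the upper limit on the age of any Friedmann model with that value of $H_0$.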

The behaviour of $a(t)$ near the big-bang origin is given by (15.26) and is
independent of the curvature density parameter $\Omega_{k,0} = 1 - \Omega_{m,0} - \Omega_{r,0}$ (and hence
of the sign of $k$). The future evolution, however, depends crucially on this constant.
From (15.21) we can distinguish three possible histories, depending on the value
of $\Omega_{k,0}$:

$\Omega_{k,0} > 0$    open ($k = -1$)    $\dot{a} \to$ non-zero constant as $a \to \infty$,

$\Omega_{k,0} = 0$    flat ($k = 0$)     $\dot{a} \to 0$ as $a \to \infty$,

$\Omega_{k,0} < 0$    closed ($k = 1$)   $\dot{a} = 0$ at some finite value $a_{\max}$.

Thus, we see the main feature of Friedmann models, namely, that the dynamics of 
the universe is directly linked to its geometry. The three cases above are illustrated 
in Figure 15.5. We shall now find explicit analytical solutions for a(t) in the 
special cases of a dust-only and a radiation-only Friedmann model. We will also 
obtain an analytic form for t as a function of a for the case of a spatially flat 
(k = 0) Friedmann model containing both matter and radiation. 



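Dust-only Friedmann models ($\Omega_{\Lambda,0} = 0$, $\Omega_{r,0} = 0$) In this case (15.21)
becomes

$$\dot{a}^2 = H_0^2\left(\Omega_{m,0}a^{-1} + 1 - \Omega_{m,0}\right) \quad\Rightarrow\quad t = \frac{1}{H_0}\int_0^a \left[\frac{x}{\Omega_{m,0} + (1 - \Omega_{m,0})x}\right]^{1/2} dx, \tag{15.27}$$

which may be integrated straightforwardly in each of the three cases $\Omega_{m,0} = 1$,
$\Omega_{m,0} > 1$ and $\Omega_{m,0} < 1$ respectively, as follows.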







Figure 15.5 Schematic illustration of the evolution of the normalised scale factor 
a(t) in closed, open and spatially flat Friedmann models. 


• For $\Omega_{m,0} = 1$ ($k = 0$) the solution is immediate, and we find that

$$a(t) = \left(\tfrac{3}{2}H_0 t\right)^{2/3}. \tag{15.28}$$


This particular case is known as the Einstein-de-Sitter (or EdS) model. 

• For $\Omega_{m,0} > 1$ ($k = 1$) the integral (15.27) can be evaluated by substituting
$x = [\Omega_{m,0}/(\Omega_{m,0} - 1)]\sin^2(\psi/2)$, where $\psi$ is known as the development angle and varies
over the range $[0, \pi]$. One then obtains

$$a = \frac{\Omega_{m,0}}{2(\Omega_{m,0} - 1)}(1 - \cos\psi), \qquad t = \frac{\Omega_{m,0}}{2H_0(\Omega_{m,0} - 1)^{3/2}}(\psi - \sin\psi),$$

which shows that the graph of $a(t)$ is a cycloid.

• For $\Omega_{m,0} < 1$ ($k = -1$) the integral (15.27) can be evaluated by substituting
$x = [\Omega_{m,0}/(1 - \Omega_{m,0})]\sinh^2(\psi/2)$, and one obtains

$$a = \frac{\Omega_{m,0}}{2(1 - \Omega_{m,0})}(\cosh\psi - 1), \qquad t = \frac{\Omega_{m,0}}{2H_0(1 - \Omega_{m,0})^{3/2}}(\sinh\psi - \psi).$$

In each case, one may also obtain expressions for $\rho_m(t) = \rho_{m,0}\,a^{-3}$ and $H(t) = \dot{a}/a$,
and hence for $\Omega_m(t)$.
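
The parametric solutions above can be used directly to find the present age $H_0 t_0$ of a dust-only model: set $a = 1$, solve for the development angle, and substitute into the expression for $t$. The sketch below is illustrative only. The function name `dust_model_age` is hypothetical, and for the closed case it assumes the universe is currently in its expanding phase, so that the angle lies in $[0, \pi]$.

```python
import math


def dust_model_age(om):
    """H0 * t0 for a dust-only Friedmann model, from the parametric forms with a = 1."""
    if abs(om - 1.0) < 1e-12:
        return 2.0 / 3.0                                   # Einstein-de-Sitter, from (15.28)
    if om > 1.0:
        # closed: 1 = [om / (2 (om - 1))] (1 - cos psi), expanding phase assumed
        psi = math.acos(1.0 - 2.0 * (om - 1.0) / om)
        return om / (2.0 * (om - 1.0) ** 1.5) * (psi - math.sin(psi))
    # open: 1 = [om / (2 (1 - om))] (cosh psi - 1)
    psi = math.acosh(1.0 + 2.0 * (1.0 - om) / om)
    return om / (2.0 * (1.0 - om) ** 1.5) * (math.sinh(psi) - psi)


for om in (0.3, 1.0, 4.0):
    print(om, dust_model_age(om))
```

In every case the result is below one, consistent with the Friedmann bound that the age is less than the Hubble time, and the open and closed branches both approach $2/3$ as $\Omega_{m,0} \to 1$.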
Radiation-only Friedmann models ($\Omega_{\Lambda,0} = 0$, $\Omega_{m,0} = 0$) In this case (15.21)
becomes

$$\dot{a}^2 = H_0^2\left(\Omega_{r,0}a^{-2} + 1 - \Omega_{r,0}\right) \quad\Rightarrow\quad t = \frac{1}{H_0}\int_0^a \frac{x}{\sqrt{\Omega_{r,0} + (1 - \Omega_{r,0})x^2}}\,dx, \tag{15.29}$$

which may again be integrated straightforwardly for $\Omega_{r,0} = 1$, $\Omega_{r,0} > 1$ and
$\Omega_{r,0} < 1$ respectively.

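• For $\Omega_{r,0} = 1$ ($k = 0$) the solution is again immediate, and we find that

$$a(t) = (2H_0 t)^{1/2}.$$

• For $\Omega_{r,0} < 1$ ($k = -1$) and $\Omega_{r,0} > 1$ ($k = 1$) the integral (15.29) can be evaluated
by inspection to give

$$a(t) = \left(2H_0\,\Omega_{r,0}^{1/2}\,t\right)^{1/2}\left(1 + \frac{1 - \Omega_{r,0}}{2\,\Omega_{r,0}^{1/2}}H_0 t\right)^{1/2}.$$

In each case, one may again obtain expressions for $\rho_r(t) = \rho_{r,0}\,a^{-4}$ and $H(t) = \dot{a}/a$,
and hence for $\Omega_r(t)$.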


Spatially flat Friedmann models ($\Omega_{\Lambda,0} = 0$, $\Omega_{m,0} + \Omega_{r,0} = 1$) In this case (15.21)
becomes

$$\dot{a}^2 = H_0^2\left(\Omega_{m,0}a^{-1} + \Omega_{r,0}a^{-2}\right) \quad\Rightarrow\quad t = \frac{1}{H_0}\int_0^a \frac{x}{\sqrt{\Omega_{m,0}x + \Omega_{r,0}}}\,dx, \tag{15.30}$$

which may be straightforwardly integrated by substituting $y = \Omega_{m,0}x + \Omega_{r,0}$ to
obtain

$$H_0 t = \frac{2}{3\Omega_{m,0}^2}\left[(\Omega_{m,0}a + \Omega_{r,0})^{1/2}(\Omega_{m,0}a - 2\Omega_{r,0}) + 2\Omega_{r,0}^{3/2}\right].$$

Unfortunately, this expression cannot be easily inverted to give $a(t)$. Nevertheless,
it is simple to show that the above expression becomes $\tfrac{2}{3}a^{3/2}$ for a matter-only
model and $\tfrac{1}{2}a^2$ for a radiation-only model, and therefore agrees with our earlier
results.
The Lemaitre models

The Lemaitre models are a generalisation of the Friedmann models in which the
cosmological constant is non-zero. In particular, we will focus here on matter-only
models ($\Omega_{r,0} = 0$), although our discussion is easily modified for radiation-only
models, and can be extended to include models containing both matter and
radiation. The general dynamical properties of Lemaitre models with $\Omega_{r,0} = 0$
were discussed in detail in Section 15.4, with particular focus on their limiting
behaviour. We concentrate here on determining the generic form of the $a(t)$-curve
for models of this type that have a big-bang origin and will expand forever.
We begin by considering the general case of arbitrary spatial curvature and then
specialise to the spatially flat case. A model of the latter sort appears to provide
a reasonable description of our own universe, if one neglects its radiation energy
density.


Matter-only Lemaitre models with arbitrary spatial curvature ($\Omega_{r,0} = 0$) In this
case the cosmological field equation (15.13) reads

$$\dot{a}^2 = H_0^2\left(\Omega_{m,0}a^{-1} + \Omega_{\Lambda,0}a^{2} + \Omega_{k,0}\right), \tag{15.31}$$

where $\Omega_{k,0} = 1 - \Omega_{m,0} - \Omega_{\Lambda,0}$. Obtaining explicit formulae giving, for example,
the scale factor as a function of time is in general quite complicated, since the
integrals turn out to involve elliptic functions,6 which are unfamiliar to most
physicists these days. Nevertheless, we see that for small $a$ the first term on the
right-hand side dominates and the equation is easily integrated. Thus, after starting
from a big-bang origin at $t = 0$, the $a(t)$-curve at first increases as

$$a(t) = \left(\tfrac{3}{2}\sqrt{\Omega_{m,0}}\,H_0 t\right)^{2/3} \qquad \text{(for small } t\text{)},$$

which agrees with our earlier result (15.26). As the universe expands, however,
the matter energy density decreases and the vacuum energy eventually dominates.
Thus, for large $t$ (and hence large $a$), the second term on the right-hand side of
(15.31) dominates. Once again the equation is then easily integrated to give

$$a(t) \propto \exp\left(H_0\sqrt{\Omega_{\Lambda,0}}\,t\right) \qquad \text{(for large } t\text{)}.$$


6 See e.g. M. Abramowitz & I. A. Stegun, Handbook of Mathematical Functions, Dover, 1972.







From the above limiting behaviour at small and large $t$, it is clear that the
universe must, at some point, make a transition from a decelerating to an accelerating
phase. This occurs when $\ddot{a} = 0$, at which point the $a(t)$-curve has a point
of inflection. Differentiating (15.31), we find that

$$\ddot{a} = \tfrac{1}{2}H_0^2\left(2\Omega_{\Lambda,0}\,a - \Omega_{m,0}\,a^{-2}\right). \tag{15.32}$$

From this result, we may verify immediately that at early cosmic times (when $a$
is small) we have $\ddot{a} < 0$, and so the expansion is decelerating. As the universe
expands, the deceleration gradually decreases until $\ddot{a}$ changes sign, after which the
expansion accelerates ever more rapidly. We see that the value of the normalised
scale factor at which the point of inflection ($\ddot{a} = 0$) occurs is given by

$$a_* = \left(\frac{\Omega_{m,0}}{2\Omega_{\Lambda,0}}\right)^{1/3}. \tag{15.33}$$


It is, in fact, possible to obtain an approximate analytic expression for the
normalised scale factor $a(t)$ in the vicinity of the point of inflection. To do this,
we must first obtain an approximate form for the cosmological field equation
(15.31) in the vicinity of this point. Denoting the cosmic time at the point of
inflection by $t_*$, we may perform separate Taylor expansions of $a$ and $\dot{a}^2$ about
$t = t_*$ to obtain

$$a \approx a_* + \dot{a}_*(t - t_*) \qquad\text{and}\qquad \dot{a}^2 \approx \dot{a}_*^2 + \dot{a}_*\dddot{a}_*(t - t_*)^2,$$

where, for notational convenience, we have written $a_* = a(t_*)$, $\dot{a}_* = \dot{a}(t_*)$, etc.
Using the first expression to substitute for $(t - t_*)^2$ in the second, we obtain

$$\dot{a}^2 \approx \dot{a}_*^2 + \frac{\dddot{a}_*(a - a_*)^2}{\dot{a}_*}. \tag{15.34}$$

Differentiating (15.32) one easily obtains an expression for $\dddot{a}$. Then, substituting
(15.33) into the resulting expression, and into (15.31), one finds that (15.34)
becomes

$$\dot{a}^2 \approx H_0^2\left[\Omega_{k,0} + 3\Omega_{\Lambda,0}a_*^2 + 3\Omega_{\Lambda,0}(a - a_*)^2\right].$$

This equation can now be integrated analytically and has the solution

$$a(t) = a_* + a_*\left[1 + \tfrac{1}{3}\Omega_{k,0}\left(\tfrac{1}{4}\Omega_{\Lambda,0}\Omega_{m,0}^2\right)^{-1/3}\right]^{1/2}\sinh\left[H_0(3\Omega_{\Lambda,0})^{1/2}(t - t_*)\right]. \tag{15.35}$$

An interesting property of this type of model is that in the case of positive
spatial curvature ($k = 1$), for which $\Omega_{k,0} < 0$, there is a ‘coasting period’ in the



Figure 15.6 The behaviour of $a(t)$ in the Lemaitre model with $k = 1$. For $k = 0$
or $k = -1$, there is no extended coasting period.

vicinity of the point where $\ddot{a} = 0$, during which the value of $a(t)$ remains almost
equal to $a_*$ (see Figure 15.6). It is easily seen from (15.35) that, by setting the
value of the quantity $\tfrac{1}{3}\Omega_{k,0}(\tfrac{1}{4}\Omega_{\Lambda,0}\Omega_{m,0}^2)^{-1/3}$ sufficiently close to $-1$, one can
make the coasting period arbitrarily long. Indeed, in the limiting case, it is easy
to show that one requires that $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$ should satisfy (15.17).

Spatially flat matter-only Lemaitre models ($\Omega_{r,0} = 0$, $\Omega_{m,0} + \Omega_{\Lambda,0} = 1$) In this
case one can give an explicit formula for the scale factor. Moreover, even if the
universe turns out not to be exactly spatially flat, recent cosmological observations
show that it is close enough to flatness for the formulae involved to act as a
reasonable first approximation and so it is worthwhile to have them available.

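In the spatially flat case, the cosmological field equation (15.13) may be written

$$\dot{a}^2 = H_0^2\left[(1 - \Omega_{\Lambda,0})a^{-1} + \Omega_{\Lambda,0}a^2\right] \quad\Rightarrow\quad t = \frac{1}{H_0}\int_0^a \frac{\sqrt{x}}{\sqrt{(1 - \Omega_{\Lambda,0}) + \Omega_{\Lambda,0}x^3}}\,dx.$$

This integral is a little more difficult than those considered earlier, but it can be
made tractable by the substitution $y^2 = x^3|\Omega_{\Lambda,0}|/(1 - \Omega_{\Lambda,0})$, which yields

$$H_0 t = \frac{2}{3\sqrt{|\Omega_{\Lambda,0}|}}\int_0^{\sqrt{a^3|\Omega_{\Lambda,0}|/(1 - \Omega_{\Lambda,0})}} \frac{dy}{\sqrt{1 \pm y^2}},$$

where the plus sign in the integrand corresponds to the case $\Omega_{\Lambda,0} > 0$ and the
minus sign to $\Omega_{\Lambda,0} < 0$. This may now be integrated easily to give

$$H_0 t = \frac{2}{3\sqrt{|\Omega_{\Lambda,0}|}}
\begin{cases}
\sinh^{-1}\left[\sqrt{a^3|\Omega_{\Lambda,0}|/(1 - \Omega_{\Lambda,0})}\right] & \text{if } \Omega_{\Lambda,0} > 0,\\[4pt]
\sin^{-1}\left[\sqrt{a^3|\Omega_{\Lambda,0}|/(1 - \Omega_{\Lambda,0})}\right] & \text{if } \Omega_{\Lambda,0} < 0,
\end{cases} \tag{15.36}$$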
which may be inverted to give $a(t)$ in each case. One can also obtain analytic
expressions for $H(t)$ and $\rho_m(t)$ (see Exercise 15.24) and thus for $\Omega_m(t)$ and
$\Omega_\Lambda(t)$.

The de Sitter model 

The de Sitter model is a particular special case of a Lemaitre model defined
by the cosmological parameters $\Omega_{m,0} = 0$, $\Omega_{r,0} = 0$ and $\Omega_{\Lambda,0} = 1$. This model
is therefore spatially flat ($k = 0$) but is not a true cosmological model in the
strictest sense, since it assumes that the matter and radiation densities are zero.
Nevertheless, it is interesting in its own right both for historical reasons and
because of its close connection with the theory of inflation (see Section 16.1).
For the de Sitter model, the cosmological field equation (15.13) reads

$$\left(\frac{\dot{a}}{a}\right)^2 = H_0^2,$$

which immediately tells us that the Hubble parameter $H(t)$ is a constant and the
normalised scale factor increases exponentially as

$$a(t) = \exp[H_0(t - t_0)] = \exp\left[\sqrt{\Lambda/3}\;c\,(t - t_0)\right],$$

where, in the second equality, we have expressed the solution in terms of the
cosmological constant $\Lambda$. Thus, the de Sitter model has no big-bang singularity
at a finite time in the past.

Einstein’s static universe 

All the cosmological models that we have constructed so far are evolving cosmolo¬ 
gies. We know now, of course, that the universe is expanding and so there is no 
conflict with the field equations. Nevertheless, it is interesting historically to look 
at Einstein’s static model of the universe. Einstein derived his field equations 
well before the discovery of the expansion of the universe and he was worried 
that he could not find static cosmological solutions. He therefore introduced the 
cosmological constant with the sole purpose of constructing static solutions. 

For $\Lambda > 0$, we seek a solution to the field equations in which the universe is
static, i.e. $\dot{a} = \ddot{a} = 0$. In this case, the Hubble parameter $H$ is zero always, and
so the dimensionless densities in (15.5) are formally infinite. It is more convenient
therefore to work with the field equations in their original forms (14.36). We see
immediately that we require

$$4\pi G\rho_{m,0} = \Lambda c^2 = \frac{c^2 k}{R_0^2}.$$

In fact, the first equality can be more succinctly written as $\rho_{m,0} = 2\rho_{\Lambda,0}$. Since
$\Lambda$ is positive we thus require $k = 1$, and so the universe has positive spatial
curvature.

How well did Einstein’s static universe fit with cosmological observations of
the time? The mean matter density of the universe is still a matter of great debate,
but recent cosmological observations suggest that

$$\rho_{m,0} \approx 3 \times 10^{-27}\ \mathrm{kg\,m^{-3}}.$$

In Einstein’s time, this value was estimated only to within about two orders
of magnitude. Nevertheless, adopting the above value of $\rho_{m,0}$ we find that the
scale factor is $R_0 \approx 2 \times 10^{26}$ m $\approx 6000$ Mpc, which is more than sufficient for the
closed spatial geometry to be large enough to encompass the observable universe.
Also $\Lambda = 1/R_0^2 = 2.5 \times 10^{-53}$ m$^{-2}$, which is small enough to evade the limits on
$\Lambda$ from Solar System experiments ($|\Lambda| < 10^{-46}$ m$^{-2}$). Thus the Einstein static
universe was not immediately and obviously wrong.

However, aside from the fact that the model disagreed with later observations 
indicating an expanding universe, it has the theoretically undesirable feature of 
being unstable. The cosmological constant must be fine-tuned to match the density 
of the universe. Thus, if we add or subtract one proton from this universe, or 
convert some matter into radiation, we will disturb the finely tuned balance 
between gravity and the cosmological constant and the universe will begin to 
expand or contract. 


15.7 Look-back time and the age of the universe 

Since the cosmological model may be fixed by specifying the values of the four
(present-day) cosmological parameters $H_0$, $\Omega_{m,0}$, $\Omega_{r,0}$ and $\Omega_{\Lambda,0}$, it is possible to
use these quantities to determine other useful derived cosmological parameters.
In this section we consider the look-back time and the age of the universe.

In Chapter 14, we showed that if a comoving particle (galaxy) emitted a photon
at cosmic time $t$ that is received by an observer at $t = t_0$ then the ‘look-back time’
$t_0 - t$ is given as a function of the photon’s redshift by

$$t_0 - t = \int_0^z \frac{d\bar{z}}{(1 + \bar{z})H(\bar{z})}. \tag{15.37}$$

From the cosmological field equation (15.13), on noting that $a = R/R_0 = (1 + z)^{-1}$
we obtain the useful result

$$H^2(z) = H_0^2\left[\Omega_{m,0}(1 + z)^3 + \Omega_{r,0}(1 + z)^4 + \Omega_{\Lambda,0} + \Omega_{k,0}(1 + z)^2\right]. \tag{15.38}$$
Thus, the look-back time to a comoving object with redshift $z$ is given by

$$t_0 - t = \frac{1}{H_0}\int_0^z \frac{d\bar{z}}{(1 + \bar{z})\sqrt{\Omega_{m,0}(1 + \bar{z})^3 + \Omega_{r,0}(1 + \bar{z})^4 + \Omega_{\Lambda,0} + \Omega_{k,0}(1 + \bar{z})^2}}.$$

We note that the differential form of this relation is perhaps more useful since
one is often interested simply in the cosmic time interval $dt$ corresponding to an
interval $dz$ in redshift. In any case, a more convenient form of the integral for
evaluation is obtained by making the substitution $x = (1 + \bar{z})^{-1}$, which yields

$$t_0 - t = \frac{1}{H_0}\int_{(1+z)^{-1}}^{1} \frac{x\,dx}{\sqrt{\Omega_{m,0}x + \Omega_{r,0} + \Omega_{\Lambda,0}x^4 + \Omega_{k,0}x^2}}. \tag{15.39}$$

Assuming $\Omega_{r,0} = 0$ (which is a reasonable approximation for our universe), in
Figure 15.7 we plot $H_0(t_0 - t)$, the look-back time in units of the Hubble time,
as a function of redshift for several values of $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$.

In any cosmological model with a big-bang origin, an extremely important
quantity is the age of the universe, i.e. the cosmic time interval between the point
when $a(t) = 0$ and the present epoch $t = t_0$. Since $z \to \infty$ at the big bang, we
may immediately obtain an expression for the age of the universe in such a model



Figure 15.7 The variation in look-back time, in units of the Hubble time, as
a function of redshift $z$ for several sets of values $(\Omega_{m,0}, \Omega_{\Lambda,0})$ as indicated,
assuming that $\Omega_{r,0}$ is negligible.
Table 15.1 The age of the universe in Gyr for various
cosmological models (with $\Omega_{r,0} = 0$)

                                 $H_0$ in km s$^{-1}$ Mpc$^{-1}$
$\Omega_{m,0}$    $\Omega_{\Lambda,0}$        50        70        90
1.0         0.0             13.1       9.3       7.2
0.3         0.0             15.8      11.3       8.8
0.3         0.7             18.9      13.5      10.5


by letting $z \to \infty$ in (15.39), so that the lower limit of the integral equals zero.
Since the resulting integral is dimensionless, we can write

$$t_0 = \frac{1}{H_0}\,f(\Omega_{m,0}, \Omega_{r,0}, \Omega_{\Lambda,0}),$$

where $f$ is the value of the integral, which is typically a number of order unity.
The age of the universe is therefore the Hubble time multiplied by a number of
order unity. For general values of the density parameters $\Omega_{m,0}$, $\Omega_{r,0}$ and $\Omega_{\Lambda,0}$,
it is not possible to perform the integral analytically and so one has to resort to
numerical integration. Table 15.1 lists the age of the universe $t_0$ for the same
values of $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$ as considered in Figure 15.7. It is interesting to compare
these values with estimates of the ages of the oldest stars in globular clusters,

$$t_{\rm stars} \approx 11.5 \pm 1.3\ \mathrm{Gyr},$$

where the uncertainty is dominated by uncertainties in the theory of stellar evolution.
Clearly, one requires $t_0 > t_{\rm stars}$ for a viable cosmology!
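
The numerical integration mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used to produce Table 15.1: the function names `age_integral` and `age_gyr` are hypothetical, the midpoint rule is one simple choice of quadrature, and the conversion factor (a Hubble time of about 9.778 Gyr for $H_0 = 100$ km s$^{-1}$ Mpc$^{-1}$) is an approximate value supplied here. The substitution $x = u^2$ is introduced only to keep the integrand smooth at the lower limit when $\Omega_{r,0} = 0$.

```python
import math

HUBBLE_TIME_GYR_AT_100 = 9.778   # approximate 1/H0 in Gyr for H0 = 100 km s^-1 Mpc^-1


def age_integral(om, orad, ol, n=20000):
    """f(om, orad, ol): the integral in (15.39) with lower limit 0, by the midpoint rule."""
    ok = 1.0 - om - orad - ol
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h            # midpoints never sample u = 0 itself
        x = u * u                    # x = u^2, so x dx = 2 u^3 du
        total += 2.0 * u * x / math.sqrt(om * x + orad + ol * x**4 + ok * x**2)
    return total * h


def age_gyr(om, orad, ol, h0_km_s_mpc):
    """Age of the universe in Gyr: Hubble time multiplied by the dimensionless integral."""
    return age_integral(om, orad, ol) * HUBBLE_TIME_GYR_AT_100 * 100.0 / h0_km_s_mpc


for om, ol in ((1.0, 0.0), (0.3, 0.0), (0.3, 0.7)):
    print(om, ol, [round(age_gyr(om, 0.0, ol, h0), 1) for h0 in (50.0, 70.0, 90.0)])
```

With these assumptions the printed ages agree with the entries of Table 15.1 to within about 0.1 Gyr; small differences can arise from rounding and from the adopted unit conversion.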

It is worth noting that, in our discussion of analytical cosmological models in
the previous section, we have already performed (a generalised version of) the
relevant integral required to calculate the corresponding age of the universe in each
case. Thus, for each model with a big-bang origin for which we have calculated an
analytical form for $a(t)$ or $t(a)$, the corresponding age of the universe is obtained
simply by setting $t = t_0$ and $a = 1$. For example, from (15.28), the age of an
Einstein-de-Sitter universe is simply $t_0 = 2/(3H_0)$. Similarly, from (15.36), the
age of a spatially flat matter-only Lemaitre model with $\Omega_{\Lambda,0} > 0$ is given by

$$t_0 = \frac{2}{3H_0\sqrt{\Omega_{\Lambda,0}}}\sinh^{-1}\sqrt{\frac{\Omega_{\Lambda,0}}{1 - \Omega_{\Lambda,0}}} = \frac{2}{3H_0}\,\frac{\tanh^{-1}\sqrt{\Omega_{\Lambda,0}}}{\sqrt{\Omega_{\Lambda,0}}},$$

where, in the second equality, we have rewritten the result in a more useful form
involving $\Omega_{\Lambda,0}$, using standard formulae for inverse hyperbolic trigonometric
functions.
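
The equivalence of the two forms of this age formula is easy to confirm numerically. The sketch below is illustrative (the function name `flat_lemaitre_age` is hypothetical) and evaluates $H_0 t_0$ both ways for a given $\Omega_{\Lambda,0}$ between 0 and 1.

```python
import math


def flat_lemaitre_age(ol):
    """Return H0 * t0 for a flat matter-only model (0 < ol < 1), computed in both forms."""
    prefactor = 2.0 / (3.0 * math.sqrt(ol))
    via_asinh = prefactor * math.asinh(math.sqrt(ol / (1.0 - ol)))
    via_atanh = prefactor * math.atanh(math.sqrt(ol))
    return via_asinh, via_atanh


print(flat_lemaitre_age(0.7))
```

For $\Omega_{\Lambda,0} = 0.7$ both forms give an age a little below one Hubble time, and as $\Omega_{\Lambda,0} \to 0$ they tend to the Einstein-de-Sitter value $2/3$.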
We may also obtain a general expression for the comoving $\chi$-coordinate of a
galaxy emitting a photon at time $t$ that is received at time $t_0$ with redshift $z$. This
is given by

$$\chi = \int_t^{t_0} \frac{c\,dt}{R(t)} = \frac{c}{R_0}\int_0^z \frac{d\bar{z}}{H(\bar{z})}.$$

We may now substitute for $H(z)$ using the expression (15.38) derived in the
previous section. Thus the $\chi$-coordinate of a comoving object with redshift $z$ is
given by

$$\chi(z) = \frac{c}{R_0 H_0}\int_0^z \frac{d\bar{z}}{\sqrt{\Omega_{m,0}(1 + \bar{z})^3 + \Omega_{r,0}(1 + \bar{z})^4 + \Omega_{\Lambda,0} + \Omega_{k,0}(1 + \bar{z})^2}}. \tag{15.40}$$

Once again, the differential form of this result is perhaps more useful, since one
is often interested in the comoving coordinate interval $d\chi$ corresponding to an
interval $dz$ in redshift. As before, a simpler form for the integral is obtained by
making the substitution $x = (1 + \bar{z})^{-1}$, which yields

$$\chi(z) = \frac{c}{R_0 H_0}\int_{(1+z)^{-1}}^{1} \frac{dx}{\sqrt{\Omega_{m,0}x + \Omega_{r,0} + \Omega_{\Lambda,0}x^4 + \Omega_{k,0}x^2}}. \tag{15.41}$$

From (14.29) and (14.31), the corresponding luminosity distance $d_L(z)$ and angular
diameter distance $d_A(z)$ to the object are given by

$$d_L(z) = R_0(1 + z)S(\chi(z)) \qquad\text{and}\qquad d_A(z) = \frac{R_0}{1 + z}S(\chi(z)),$$

where $S(\chi)$ is given by (14.12), whereas the proper distance to the object is simply
$d(z) = R_0\chi(z)$. It is useful to introduce the notation $\chi(z) = cE(z)/(R_0 H_0)$, so
that $E(z)$ denotes the integral in (15.41). Using the expression (15.10) to obtain
$\Omega_{k,0}$, one can then write

$$R_0 S(\chi(z)) = \frac{c}{H_0}
\begin{cases}
|\Omega_{k,0}|^{-1/2}\,S\!\left(\sqrt{|\Omega_{k,0}|}\,E(z)\right) & \text{for } \Omega_{k,0} \neq 0,\\[4pt]
E(z) & \text{for } \Omega_{k,0} = 0,
\end{cases}$$

which allows simple direct evaluation of $d_L(z)$ and $d_A(z)$ in each case.
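
The direct evaluation described above can be sketched numerically as follows. This is an illustrative sketch under stated assumptions: all function names are hypothetical, distances are returned in units of $c/H_0$, the midpoint rule is one simple choice of quadrature, and $S$ is taken to be sin for a closed model ($\Omega_{k,0} < 0$), sinh for an open model ($\Omega_{k,0} > 0$) and the identity for a flat model, consistent with the sign conventions used in this chapter.

```python
import math


def E(z, om, orad, ol, n=2000):
    """The integral in (15.41), from x = 1/(1+z) to 1, by the midpoint rule."""
    ok = 1.0 - om - orad - ol
    lo = 1.0 / (1.0 + z)
    h = (1.0 - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += 1.0 / math.sqrt(om * x + orad + ol * x**4 + ok * x**2)
    return total * h


def r0_s_chi(z, om, orad, ol):
    """R0 * S(chi(z)) in units of c/H0."""
    ok = 1.0 - om - orad - ol
    e = E(z, om, orad, ol)
    if abs(ok) < 1e-12:
        return e                                  # spatially flat
    s = math.sqrt(abs(ok))
    return (math.sin(s * e) if ok < 0.0 else math.sinh(s * e)) / s


def luminosity_distance(z, om, orad, ol):
    return (1.0 + z) * r0_s_chi(z, om, orad, ol)


def angular_diameter_distance(z, om, orad, ol):
    return r0_s_chi(z, om, orad, ol) / (1.0 + z)
```

Two simple checks are available: for the Einstein-de-Sitter case the integral is $2[1 - (1+z)^{-1/2}]$, and for an empty open model ($\Omega_{m,0} = \Omega_{r,0} = \Omega_{\Lambda,0} = 0$) the integrand is $1/x$, so that $R_0 S(\chi) = (c/H_0)\sinh[\ln(1+z)]$.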

As was the case in the previous section, for general values of $\Omega_{m,0}$, $\Omega_{r,0}$ and
$\Omega_{\Lambda,0}$ it is not possible to perform the integral (15.41) analytically and so one
has to resort to numerical integration. Figure 15.8 shows plots of dimensionless
luminosity distance $(c/H_0)^{-1}d_L(z)$ (top panel) and dimensionless angular diameter
distance $(c/H_0)^{-1}d_A(z)$ (bottom panel) for various values of $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$,
assuming that $\Omega_{r,0}$ is negligible; the solid, broken and dotted lines correspond to
Figure 15.8 The variation in dimensionless luminosity distance (top panel) and
dimensionless angular diameter distance (bottom panel) as functions of redshift,
for different sets of values $(\Omega_{m,0}, \Omega_{\Lambda,0})$ as indicated, assuming that $\Omega_{r,0}$ is
negligible. The solid, broken and dotted lines correspond to spatially flat, open
and closed models respectively.


spatially flat, open and closed models respectively. In particular, it is worth noting
that, for the models with a non-zero matter density, the angular diameter distance
has a maximum at some finite value of the redshift $z = z_*$. Thus, for a source of
fixed proper length $\ell$, the angular diameter $\Delta\theta = \ell/d_A$ declines with redshift for
$z < z_*$, as one might naively expect, but then increases with redshift for $z > z_*$.
A very-high-redshift galaxy (if such a thing existed) would therefore cast a large,
but dim, ghostly image on the sky. The physical reason for this is that the light
from a distant object was emitted when the universe was much younger than it is
now - the object was close to us when the light was emitted. This, coupled with
gravitational focussing of the light rays by the intervening matter in the universe,
means that the galaxy looks big!

The integral (15.41) can, in fact, be evaluated analytically in some simple cases.
As an example, consider the Einstein-de-Sitter (EdS) model ($\Omega_{m,0} = 1$, $\Omega_{r,0} = 0$,
$\Omega_{\Lambda,0} = 0$). In this case, we find that

$$\chi(z) = \frac{c}{R_0 H_0}\int_{(1+z)^{-1}}^{1} \frac{dx}{\sqrt{x}} = \frac{2c}{R_0 H_0}\left[1 - (1 + z)^{-1/2}\right].$$

Thus, the luminosity distance in the EdS model is given as a function of $z$ by

$$d_L(z) = \frac{2c}{H_0}(1 + z)\left[1 - (1 + z)^{-1/2}\right],$$

and the angular diameter distance by

$$d_A(z) = \frac{2c}{H_0}\,\frac{1}{1 + z}\left[1 - (1 + z)^{-1/2}\right].$$

Note that, in this case, $d_A(z)$ has a maximum at a redshift $z = 5/4$.
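
The location of this maximum can be checked in two ways. Analytically, writing $u = (1 + z)^{-1/2}$ gives $d_A \propto u^2(1 - u)$, which is stationary at $u = 2/3$, i.e. $1 + z = 9/4$. Numerically, a simple grid search confirms the same value; the sketch below is illustrative, with a hypothetical function name and distances expressed in units of $c/H_0$.

```python
def eds_angular_diameter_distance(z):
    """d_A in units of c/H0 for the Einstein-de-Sitter model."""
    return 2.0 / (1.0 + z) * (1.0 - (1.0 + z) ** -0.5)


# search a uniform redshift grid between 0.001 and 5 for the largest d_A
zs = [i * 0.001 for i in range(1, 5001)]
z_peak = max(zs, key=eds_angular_diameter_distance)
print(z_peak, eds_angular_diameter_distance(z_peak))
```

At the peak the distance equals $8/27$ in units of $c/H_0$, which follows from substituting $u = 2/3$ into $2u^2(1 - u)$.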

The relations between redshift and luminosity distance (or angular diameter
distance) form the basis of observational tests of the geometry of the universe. All
one needs is a standard candle (for application of the luminosity-distance-redshift
relation) or a standard ruler (for application of the angular-diameter-distance-redshift
relation). Comparison with the predicted relations shown in Figure 15.8
can then fix the values of $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$. Unfortunately, standard candles and
standard rulers are hard to find in the universe! Nevertheless, in recent years
there has been remarkable progress, using distant Type Ia supernovae as standard
candles and anisotropies in the cosmic microwave background radiation as a
standard ruler. The results of these observations suggest that we live in a spatially
flat universe with $\Omega_{m,0} \approx 0.3$ and $\Omega_{\Lambda,0} \approx 0.7$.


15.9 The volume-redshift relation 

In Section 14.10 we found that, at the present cosmic time $t_0$, the proper volume
of the region of space lying in the infinitesimal coordinate range $\chi \to \chi + d\chi$ and
subtending an infinitesimal solid angle $d\Omega = \sin\theta\,d\theta\,d\phi$ at the observer is

$$dV_0 = \frac{cR_0^2 S^2(\chi(z))}{H(z)}\,dz\,d\Omega, \tag{15.42}$$
the corresponding volume of this region at a redshift $z$ being given by $dV(z) =
dV_0/(1 + z)^3$. We may now express $dV_0$ in terms of the cosmological parameters
$H_0$, $\Omega_{m,0}$, $\Omega_{r,0}$ and $\Omega_{\Lambda,0}$. Using the expressions (15.40), (15.38) and (15.10) for
$\chi(z)$, $H(z)$ and $\Omega_{k,0}$ respectively, we find immediately that

$$\frac{dV_0}{dz\,d\Omega} = \left(\frac{c}{H_0}\right)^3 \frac{1}{h(z)}
\begin{cases}
|\Omega_{k,0}|^{-1}\,S^2\!\left(\sqrt{|\Omega_{k,0}|}\,E(z)\right) & \text{for } \Omega_{k,0} \neq 0,\\[4pt]
E^2(z) & \text{for } \Omega_{k,0} = 0,
\end{cases} \tag{15.43}$$

where we have defined the new function

$$h(z) = \frac{H(z)}{H_0} = \sqrt{\Omega_{m,0}(1 + z)^3 + \Omega_{r,0}(1 + z)^4 + \Omega_{\Lambda,0} + \Omega_{k,0}(1 + z)^2}$$

and $E(z) = \int_0^z d\bar{z}/h(\bar{z})$ is the function defined in the previous section.

For general values of $\Omega_{m,0}$, $\Omega_{r,0}$ and $\Omega_{\Lambda,0}$, one must once again resort to
numerical integration to obtain $dV_0$. In Figure 15.9, we plot the dimensionless
differential comoving volume element $(c/H_0)^{-3}\,dV_0/(dz\,d\Omega)$ as a function of
redshift $z$ for several values of $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$, assuming that $\Omega_{r,0} = 0$. In
particular, we note that, in the currently favoured case $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0.3, 0.7)$,
we may explore a large comoving volume by observing objects in the redshift
range $z = 2$-$3$.



Figure 15.9 The variation in the dimensionless differential comoving volume
element as a function of redshift $z$ for several sets of values $(\Omega_{m,0}, \Omega_{\Lambda,0})$ as
indicated, assuming that $\Omega_{r,0}$ is negligible.
For the majority of our discussion so far, we have concentrated on exploring
cosmological models with properties determined by fixing the values of the
present-day densities $\Omega_{m,0}$, $\Omega_{r,0}$ and $\Omega_{\Lambda,0}$. From the definition (15.5), however, it
is clear that each density is, in general, a function of cosmic time $t$. It is therefore
of interest to investigate the evolution of these densities as the universe expands.

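From (15.5) we have

$$\Omega_i(t) = \frac{8\pi G}{3H^2(t)}\,\rho_i(t) \quad\Rightarrow\quad \dot\Omega_i = \frac{8\pi G}{3H^2}\left(\dot\rho_i - \frac{2\dot H}{H}\rho_i\right), \qquad (15.44)$$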


where the label $i$ denotes ‘m’, ‘r’ or ‘Λ’ and the dots denote differentiation with
respect to cosmic time $t$. From the equation of motion (14.39) for a cosmological
fluid, however, we have

$$\dot\rho_i = -3(1 + w_i)H\rho_i,$$

where we have written $H = \dot R/R$, and $w_i = p_i/(\rho_i c^2)$ is the equation-of-state
parameter. Thus (15.44) becomes

$$\dot\Omega_i = -\Omega_i H\left[3(1 + w_i) + \frac{2\dot H}{H^2}\right], \qquad (15.45)$$


where we have taken a factor of $H$ outside the brackets for later convenience. We
now need an expression for $\dot H$, which is given by

$$\dot H = \frac{d}{dt}\left(\frac{\dot R}{R}\right) = \frac{\ddot R}{R} - \left(\frac{\dot R}{R}\right)^2,$$

and so we may write

$$\frac{\dot H}{H^2} = \frac{\ddot R R}{\dot R^2} - 1 = -(q + 1),$$

where $q$ is the deceleration parameter. Substituting this result into (15.45) and
using the expression (15.14) for $q$, we finally obtain the neat relation

$$\dot\Omega_i = \Omega_i H(\Omega_m + 2\Omega_r - 2\Omega_\Lambda - 1 - 3w_i).$$

Setting $w_i = 0$, $\tfrac{1}{3}$ and $-1$ respectively for matter (dust), radiation and the vacuum,
we thus obtain

$$\begin{aligned}
\dot\Omega_m &= \Omega_m H[(\Omega_m - 1) + 2\Omega_r - 2\Omega_\Lambda], \\
\dot\Omega_r &= \Omega_r H[\Omega_m + 2(\Omega_r - 1) - 2\Omega_\Lambda], \\
\dot\Omega_\Lambda &= \Omega_\Lambda H[\Omega_m + 2\Omega_r - 2(\Omega_\Lambda - 1)].
\end{aligned} \qquad (15.46)$$

By dividing these equations by one another, we may remove the dependence on
the Hubble parameter $H$ and the cosmic time $t$ and hence obtain a set of coupled
first-order differential equations in the variables $\Omega_m$, $\Omega_r$ and $\Omega_\Lambda$ alone. Therefore,
given some general point in this parameter space, these equations define a unique
trajectory that passes through this point. As an illustration, let us consider the
case in which $\Omega_r = 0$. Dividing the remaining two equations then gives

$$\frac{d\Omega_\Lambda}{d\Omega_m} = \frac{\Omega_\Lambda[\Omega_m - 2(\Omega_\Lambda - 1)]}{\Omega_m[(\Omega_m - 1) - 2\Omega_\Lambda]},$$

which defines a set of trajectories (or ‘flow lines’) in the $(\Omega_m, \Omega_\Lambda)$-plane. This
equation also highlights the significance of the points (1,0) and (0,1) in this plane,
which act as ‘attractors’ for the trajectories. This is illustrated in Figure 15.10,
which shows a set of trajectories for various cosmological models. Since any
general point in the plane defines a unique trajectory passing through that point, it
is convenient to specify each trajectory by the present-day values $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$
(although one could equally well use the values at any other cosmic time). In
the left-hand panel, we plot trajectories passing through $\Omega_{m,0} = 0.3$ and $\Omega_{\Lambda,0} =
0.1, 0.2, \ldots, 1.1$, and in the right-hand panel the trajectories pass through $\Omega_{\Lambda,0} =
0.7$ and $\Omega_{m,0} = 0.1, 0.2, \ldots, 1.1$. We see that the trajectories all start at (1,0),
which is an unstable fixed point, and converge on (0,1), which is a stable fixed
point.
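The flow defined by (15.46) can be traced numerically. The sketch below is one possible approach, not the method used for Figure 15.10: it uses $\ln a$ as the independent variable (since $d/dt = H\,d/d\ln a$, the factor $H$ drops out) and a classical fourth-order Runge-Kutta step. The function names and step count are assumptions made here; `exact` gives the closed-form values $\Omega_i = \Omega_{i,0}a^{-3(1+w_i)}H_0^2/H^2$ for comparison.

```python
import math

# Equation-of-state parameters for matter, radiation and vacuum
W = (0.0, 1.0 / 3.0, -1.0)

def rhs(omega):
    """d(Omega_i)/d(ln a) from (15.46), using d/dt = H d/d(ln a)."""
    m, r, l = omega
    common = m + 2.0 * r - 2.0 * l - 1.0
    return tuple(o * (common - 3.0 * w) for o, w in zip(omega, W))

def evolve(omega0, ln_a, steps=2000):
    """Integrate from today (ln a = 0) to ln_a with classical RK4."""
    dN = ln_a / steps
    y = tuple(omega0)
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(tuple(a + 0.5 * dN * b for a, b in zip(y, k1)))
        k3 = rhs(tuple(a + 0.5 * dN * b for a, b in zip(y, k2)))
        k4 = rhs(tuple(a + dN * b for a, b in zip(y, k3)))
        y = tuple(a + dN * (p + 2 * q + 2 * u + v) / 6.0
                  for a, p, q, u, v in zip(y, k1, k2, k3, k4))
    return y

def exact(omega0, a):
    """Closed-form density parameters at normalised scale factor a."""
    m0, r0, l0 = omega0
    k0 = 1.0 - m0 - r0 - l0
    h2 = m0 * a ** -3 + r0 * a ** -4 + l0 + k0 * a ** -2
    return (m0 * a ** -3 / h2, r0 * a ** -4 / h2, l0 / h2)
```

Integrating backwards (negative `ln_a`) carries a trajectory towards (1,0), and integrating forwards with $\Omega_{\Lambda,0} > 0$ carries it towards (0,1), in line with the fixed-point behaviour described above.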








Figure 15.10 Evolution of the density parameters $\Omega_m$ and $\Omega_\Lambda$ for various
cosmological models passing through the points $\Omega_{m,0} = 0.3$ and $\Omega_{\Lambda,0} =
0.1, 0.2, \ldots, 1.1$ (left-hand panel) and $\Omega_{\Lambda,0} = 0.7$ and $\Omega_{m,0} = 0.1, 0.2, \ldots, 1.1$
(right-hand panel).

It is worth noting the profound effect of a non-zero cosmological constant on
the evolution of the density parameters. In the case $\Lambda = 0$, any slight deviation
from $\Omega_m = 1$ in the early universe results in a rapid evolution away from the point
(1,0) along the $\Omega_m$-axis, tending to (0, 0) for an open universe and to ($\infty$, 0) for
a closed one. If $\Lambda > 0$, however, the trajectory is ‘refocussed’ and tends to the
spatially flat de Sitter case (0,1). Indeed, for a wide range of initial conditions,
by the time the matter density has reached $\Omega_m \approx 0.3$ the universe is close to
spatially flat.


15.11 Evolution of the spatial curvature 

We may investigate directly the behaviour of the spatial curvature as the universe 
expands by determining the evolution of the curvature density parameter 

$$\Omega_k = 1 - \Omega_m - \Omega_r - \Omega_\Lambda = -\frac{c^2 k}{H^2 R^2}. \qquad (15.47)$$

Differentiating the final expression on the right-hand side with respect to cosmic 
time, or combining the derivatives (15.46), one quickly finds that 


$$\dot\Omega_k = 2\Omega_k H q = \Omega_k H(\Omega_m + 2\Omega_r - 2\Omega_\Lambda), \qquad (15.48)$$

where $q$ is the deceleration parameter. We observe that if $\Omega_\Lambda = 0$ then the quantity
in parentheses is always positive. Thus, in this case, if $\Omega_k$ differs slightly from
zero at some early cosmic time then the spatial curvature rapidly evolves away
from the spatially flat case. In particular, $\Omega_k \to 1$ in the open case and $\Omega_k \to -\infty$
in the closed case. The presence of a positive cosmological constant, however,
changes this behaviour completely. In this case, at some finite cosmic time the
$2\Omega_\Lambda$ term in (15.48) will dominate the matter and radiation terms, with the result
that $\Omega_k$ is ‘refocussed’ back to $\Omega_k = 0$.

We may in fact obtain an analytic expression for the spatial curvature as a
function of redshift $z$, in terms of the present-day values of the density parameters.
Substituting for $c^2 k$ from (15.47) evaluated at $t = t_0$, and noting that $R_0/R = 1 + z$,
we obtain the useful general formula

$$\Omega_k(z) = \Omega_{k,0}\left[\frac{H_0(1 + z)}{H(z)}\right]^2;$$

using our expression (15.38) for $H(z)$ then gives

$$\Omega_k(z) = \frac{\Omega_{k,0}}{\Omega_{m,0}(1 + z) + \Omega_{r,0}(1 + z)^2 + \Omega_{\Lambda,0}(1 + z)^{-2} + \Omega_{k,0}}.$$

In particular, we see that (apart from models with only vacuum energy), even if
the present-day value $\Omega_{k,0}$ differs greatly from zero, at very high redshift (i.e.
in the distant past) $\Omega_k$ must have differed by only a tiny amount from zero.

Since today we measure the value $\Omega_{k,0}$ to be (conservatively) in the range $-0.5$
to $0.5$, this means that at very early epochs $\Omega_k$ must have been very finely tuned
to near zero. This tuning of the initial conditions of the expansion is called the
flatness problem and has no solution within standard cosmological models. From
our above discussion, however, the presence of a positive cosmological constant
goes some way to explaining why the universe is close to spatially flat at the
present epoch.
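To get a feel for the degree of fine tuning, the formula for the curvature density parameter as a function of redshift can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical and any parameter values passed to it (for example a present-day radiation density of order $10^{-4}$) are assumptions of the reader, not values taken from the text.

```python
def omega_k(z, om0, or0, ol0):
    """Curvature density parameter at redshift z, given present-day densities."""
    ok0 = 1.0 - om0 - or0 - ol0
    x = 1.0 + z
    return ok0 / (om0 * x + or0 * x ** 2 + ol0 / x ** 2 + ok0)
```

For instance, with the assumed values `om0 = 0.3`, `or0 = 1e-4`, `ol0 = 0.2` (so that $\Omega_{k,0} \approx 0.5$ today), the radiation term in the denominator dominates at $z = 10^9$ and the function returns a value of order $10^{-15}$.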


15.12 The particle horizon, event horizon and Hubble distance 

Thus far, we have considered the evolution of the entire spatial part of the 
FRW geometry. It is, however, interesting to consider the extent of the region 
‘accessible’ (via light signals) to some comoving observer at a given cosmic 
time t. 


Particle horizon 

Let us consider a comoving observer O situated (without loss of generality) at
$\chi = 0$. Suppose further that a second comoving observer E has coordinate $\chi_1$ and
emits a photon at cosmic time $t_1$, which reaches O at time $t$. Assuming light to
be the fastest possible signal, the only signals emitted at time $t_1$ that O receives
by the time $t$ are from radial coordinates $\chi < \chi_1$.

The comoving coordinate $\chi_1$ of the emitter E is determined by

$$\chi_1 = c\int_{t_1}^{t} \frac{dt'}{R(t')}. \qquad (15.49)$$

If the integral on the right-hand side diverges as $t_1 \to 0$ then $\chi_1$ can be made as
large as we please by taking $t_1$ sufficiently small. Thus, in this case, in principle
it is possible to receive signals emitted at sufficiently early epochs from any
comoving particle (such as a typical galaxy). If, however, the integral converges
as $t_1 \to 0$ then $\chi_1$ can never exceed a certain value for a given $t$. In this case our
vision of the universe is limited by a particle horizon. At any given cosmic time
$t$, the $\chi$-coordinate of the particle horizon is given by

$$\chi_p(t) = c\int_0^{t} \frac{dt'}{R(t')} = c\int_0^{R(t)} \frac{dR}{R\dot R}, \qquad (15.50)$$

where in the second equality we have rewritten the expression as an integral over
$R$. The corresponding proper distance to the particle horizon is $d_p(t) = R(t)\chi_p(t)$.
We see that expression (15.50) will be finite if $R\dot R \sim R^\alpha$ with $\alpha < 1$, which is
equivalent to the condition $\ddot R < 0$. Hence, any universe for which the expansion
has been continually decelerating up to the cosmic time $t$ will have a finite particle
horizon at that time. Clearly, this includes all the Friedmann models that we
discussed earlier, but particle horizons also occur in other cosmological models,
for example in the spatially flat Lemaître model with $\Omega_{m,0} \approx 0.3$ and $\Omega_{\Lambda,0} \approx 0.7$
that seems to provide a reasonable description of our universe.

On differentiating (15.50) with respect to $t$, we have $d\chi_p/dt = c/R(t)$, which is
always greater than zero. Thus, the particle horizon of a comoving observer grows
as the cosmic time $t$ increases, and so parts of the universe that were not in view
previously must gradually come into view. This does not mean, however, that a
galaxy that was not visible at one instant suddenly appears in the sky a moment
later! To understand this, we note that if the universe has a big-bang origin then
we have $R(t_1) \to 0$ as $t_1 \to 0$, and so $z \to \infty$. Thus, the particle horizon at any
given cosmic time is the surface of infinite redshift, beyond which we cannot see.
If the particle horizon grew to encompass a galaxy, the galaxy would therefore
appear at first with an infinite redshift, which would gradually reduce as more
cosmic time passed. Hence the galaxy would not simply ‘pop’ into view. 7

In fact, we can obtain explicit expressions for the particle horizon in some
cosmological models. For example, a matter-dominated model at early epochs
obeys $R(t)/R_0 = (t/t_0)^{2/3}$, whereas a radiation-dominated model at early epochs
obeys $R(t)/R_0 = (t/t_0)^{1/2}$. Substituting these expressions into (15.50) gives the
proper distance to the particle horizon at cosmic time $t$ as

$$d_p(t) = 3ct \ \text{(matter-dominated)}, \qquad d_p(t) = 2ct \ \text{(radiation-dominated)}.$$

These proper distances are larger than $ct$ because the universe has expanded while
the photon has been travelling. Alternatively, if one has an analytic expression
for $\chi(z)$ for some cosmological model then the corresponding expression for $\chi_p$
may be obtained simply by letting $z \to \infty$.
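The two results above can be checked numerically for a power-law expansion $R/R_0 = (t/t_0)^n$, for which the proper distance to the particle horizon is $ct/(1-n)$. The following is a sketch under stated assumptions: units with $R_0 = 1$, a hypothetical function name, and a change of variable $t' = t s^6$ chosen here so that the integrand is smooth at the origin for $n = 2/3$ and $n = 1/2$.

```python
def proper_horizon(n, t, c=1.0, t0=1.0, steps=1000):
    """d_p(t) = R(t) * c * integral_0^t dt'/R(t') for R = (t/t0)**n (R_0 = 1)."""
    def R(tt):
        return (tt / t0) ** n

    ds = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds            # midpoint rule never evaluates at s = 0
        tp = t * s ** 6               # substitution t' = t s^6, dt' = 6 t s^5 ds
        total += 6.0 * t * s ** 5 / R(tp) * ds
    return R(t) * c * total
```

With $c = 1$ the function should return values close to $3t$ for `n = 2/3` and $2t$ for `n = 1/2`.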

The existence of particle horizons for the common cosmological models
illustrates the horizon problem, i.e. how do vastly separated regions display the
same physical characteristics (e.g. the nearly uniform temperature of the cosmic
microwave background) when, according to standard cosmological models, these
regions could never have been in causal contact? This problem, like the flatness
problem, is a serious challenge to standard cosmology that can only be resolved
by invoking the theory of inflation (see Chapter 16).


7 In practice, our view of the universe is not limited by our particle horizon but by the epoch of recombination,
which occurred at $z_{\rm rec} \approx 1500$ (long before the formation of any galaxies). Prior to this epoch, the universe
was ionised and photons were frequently scattered by the free electrons, whereas after this point electrons and
protons (and neutrons) combined to form atoms and the photons were able to propagate freely. This surface
of last scattering is therefore the effective limit of our observable universe.

The horizon problem can be illustrated by a simple example. Consider a galaxy
at a proper distance of $10^9$ light years away from us. Since the age of the universe
is $\sim 1.5 \times 10^{10}$ years, there has been sufficient time to exchange about 15 light
signals with the galaxy. At earlier times, when the scale factor $R$ was smaller,
everything was closer together and so we might have naively expected that this
would improve causal contact. In a continuously decelerating universe, however,
it makes the problem worse. At, for example, the epoch of recombination (when
the cosmic microwave background photons were emitted) the redshift $z$ was
approximately 1000, so $R(t_{\rm rec})/R_0 \approx 10^{-3}$ and the proper distance to the ‘galaxy’
is $10^6$ light years. 8 If we assume, for simplicity, that after $t_{\rm rec}$ the expansion
followed a matter-dominated Einstein-de-Sitter universe, then

$$\frac{R(t_{\rm rec})}{R_0} = \left(\frac{t_{\rm rec}}{t_0}\right)^{2/3} \approx 10^{-3},$$

and so $t_{\rm rec} = 1.5 \times 10^{5.5}$ years. However, assuming that prior to $t_{\rm rec}$ the expansion
followed a radiation-dominated Einstein-de-Sitter model, the proper distance to
the particle (causal) horizon is $2ct_{\rm rec} = 3 \times 10^{5.5}$ light years. Thus, by $t_{\rm rec}$ ‘we’
could not have exchanged even one light signal with the other ‘galaxy’.
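The arithmetic of this example is easily reproduced. The sketch below simply restates the numbers quoted in the text (with distances in light years and times in years, so that $c = 1$); the function name and its default arguments are introduced here for illustration.

```python
def horizon_problem(t0=1.5e10, d0=1.0e9, scale=1.0e-3):
    """Return (d_rec, t_rec, d_hor) in light years / years, with c = 1 ly/yr.

    t0    : present age of the universe, as quoted in the text
    d0    : present proper distance to the galaxy
    scale : R(t_rec)/R_0, taken as 10^-3 for z ~ 1000
    """
    d_rec = d0 * scale            # proper distance to the 'galaxy' at recombination
    t_rec = t0 * scale ** 1.5     # from R/R_0 = (t/t_0)^(2/3) after recombination
    d_hor = 2.0 * t_rec           # radiation-dominated particle horizon, 2 c t
    return d_rec, t_rec, d_hor
```

With the default values, `t_rec` is about $4.7 \times 10^5$ years and `d_hor` about $9.5 \times 10^5$ light years, which is smaller than the separation `d_rec` of $10^6$ light years.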

Event horizon 

Although our particle horizon grows as the cosmic time $t$ increases, in some
cosmological models there could be events that we may never see (or, conversely,
never influence). Returning to our expression (15.49), we see that if the integral
on the right-hand side diverges as $t \to \infty$ (or the time at which $R$ equals zero
again), then it will be possible to receive light signals from any event. However,
if the integral instead converges for large $t$ then, for light signals emitted at $t_1$, we
will only ever receive those from events for which the $\chi$-coordinate is less than

$$\chi_e(t_1) = c\int_{t_1}^{t_{\max}} \frac{dt}{R(t)},$$

where $t_{\max}$ is either infinity or the time of the big crunch (i.e. $R(t_{\max}) = 0$). This
is called the event horizon. By symmetry, $\chi_e(t_0)$ is the maximum $\chi$-coordinate
that can be reached by a light signal sent by us today.
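As a simple worked case (an illustrative assumption made here, not an example from the text), consider a de Sitter expansion $R(t) = R_0\exp[H(t - t_0)]$ with constant $H$, for which $t_{\max} = \infty$. Writing $\chi_e(t_1)$ for the limiting coordinate, the integral of $c/R$ from $t_1$ to infinity converges:

```latex
\chi_e(t_1) = c\int_{t_1}^{\infty} \frac{dt}{R_0\,e^{H(t - t_0)}}
            = \frac{c}{R_0 H}\,e^{-H(t_1 - t_0)}
            = \frac{c}{H R(t_1)},
\qquad
R(t_1)\,\chi_e(t_1) = \frac{c}{H}.
```

In this model the proper distance to the event horizon is therefore constant and equal to $c/H$.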

8 In reality the galaxy would not yet have formed, but this does not affect the main point of the argument.

Hubble distance

From our discussion in Section 15.7, the elapsed cosmic time $t$ since the big
bang is, in general, of the order $H^{-1}(t)$, which is known as the Hubble time and
provides a characteristic time scale for the expansion of the universe. In a similar
way, at a cosmic time $t$ one can define the Hubble distance

$$d_H(t) = cH^{-1}(t),$$


which provides a characteristic length scale for the universe. We may also define
the comoving Hubble distance

$$\chi_H(t) = \frac{d_H(t)}{R(t)} = \frac{c}{H(t)R(t)} = \frac{c}{\dot R(t)}, \qquad (15.51)$$


where in the last equality we have used the fact that $H = \dot R/R$. The above
expression simply gives the $\chi$-coordinate corresponding to the Hubble distance.

The Hubble distance $d_H(t)$ corresponds to the typical length scale (at cosmic
time $t$) over which physical processes in the universe operate coherently. It is also
the length scale at which general-relativistic effects become important; indeed,
on length scales much less than $d_H(t)$, Newtonian theory is often sufficient to
describe the effects of gravitation. From our discussion above, we further note
that the proper distance to the particle horizon for standard cosmological models
is typically

$$d_p(t) \sim ct \sim cH^{-1}(t).$$

Thus, we see that the particle horizon in such cases is of the same order as the
Hubble distance. As a result, the Hubble distance is often described simply as the
‘horizon’. It should be noted, however, that the particle horizon and the Hubble
distance are distinct quantities, which may differ by many orders of magnitude in
inflationary cosmologies, which we discuss in the next chapter. In particular, we
note that the particle horizon at time $t$ depends on the entire expansion history of
the universe to that point, whereas the Hubble distance is defined instantaneously
at $t$. Moreover, once an object lies within an observer’s particle horizon it remains
so. By contrast, an object can be within an observer’s Hubble distance at one
time, lie outside it at some later time and even come back within it at a still later
epoch.


Exercises 

15.1 For blackbody radiation, the number density of photons with frequencies in the
range $[\nu, \nu + d\nu]$ is given by

$$n(\nu, T)\,d\nu = \frac{8\pi\nu^2}{c^3}\,\frac{1}{e^{h\nu/k_B T} - 1}\,d\nu, \qquad (E15.1)$$

where $T$ is the ‘temperature’ of the radiation. By conserving the total number
of photons, show that the photon energy distribution of the cosmic microwave
background (CMB) radiation retains its general blackbody form as the universe
expands. Show further that the total number density $n$ of photons is

$$n(T) = 0.244\left(\frac{2\pi k_B T}{hc}\right)^3.$$

Hence show that the present-day number density of CMB photons in the universe is
$n_0 \approx 4 \times 10^8$ m$^{-3}$, and compare this with the present-day number density of protons.
How does this ratio vary with cosmic time?

Hint: $\int_0^\infty \frac{x^2}{e^x - 1}\,dx \approx 0.244\pi^2$.

15.2 Suppose that the present-day energy densities of radiation and matter (in the form
of dust) are $\rho_r(t_0)c^2$ and $\rho_m(t_0)c^2$ respectively. Show that the energy densities of
the two components were equal at a redshift $z_{\rm eq}$ given by

$$1 + z_{\rm eq} = \frac{\rho_m(t_0)}{\rho_r(t_0)}.$$

What assumptions underlie this result? Hence show that

$$1 + z_{\rm eq} = \frac{3c^2 H_0^2\,\Omega_{m,0}}{8\pi G a T_0^4},$$

where $a$ is the reduced Stefan-Boltzmann constant and $T_0$ is the present-day
temperature of the cosmic microwave background. Show that for our universe $z_{\rm eq} \approx 5000$.
What was the temperature of the CMB radiation at this epoch?

15.3 Show that in the early, radiation-dominated, phase of the universe, the temperature
$T$ of the radiation satisfies the equation

$$\left(\frac{\dot T}{T}\right)^2 = \frac{8\pi G a T^4}{3c^2},$$

where the dot denotes differentiation with respect to the cosmic time $t$ and $a$ is the
reduced Stefan-Boltzmann constant. Hence show that

$$T = \left(\frac{3c^2}{32\pi G a}\right)^{1/4} t^{-1/2},$$

and that the cosmic time at matter-radiation equality is $t_{\rm eq} \approx 16\,000$ years.

15.4 The CMB radiation was emitted at the epoch of recombination at redshift $z_{\rm rec} \approx
1500$. Show that $t_{\rm rec} \approx 450\,000$ years.

15.5 Consider a cylindrical piston chamber of cross-sectional area $A$ ‘filled’ with vacuum
energy. The piston is withdrawn a linear distance $dx$. Show that the energy created
by withdrawing the piston equals the work done by the vacuum, provided that

$$p_{\rm vac} = -\rho_{\rm vac}c^2.$$

Hence show that, in this case, the vacuum energy density is constant as the piston
is withdrawn.

15.6 Show that the present-day value of the scale factor of the universe may be written as

$$R_0 = \frac{c}{H_0}\left(\frac{k}{\Omega_{m,0} + \Omega_{r,0} + \Omega_{\Lambda,0} - 1}\right)^{1/2}.$$

What value does $R_0$ take in a spatially flat universe?

15.7 Show that, for our universe to be spatially flat, the total density must be equivalent
to $\sim 5$ protons m$^{-3}$.

15.8 In the Newtonian cosmological model discussed in Exercise 14.14, show that the
total energy $E$ of the test particle of mass $m$ can be written as

$$E = \tfrac{1}{2}m(1 - \Omega_m)R^2H^2,$$

and interpret this result physically.

15.9 Show that at all cosmic times the density parameters obey the relation

$$\Omega_m + \Omega_r + \Omega_\Lambda + \Omega_k = 1.$$

15.10 In terms of the dimensionless density parameters, show that the two cosmological
field equations can be written in the forms

$$H^2 = H_0^2\left[\Omega_{m,0}a^{-3} + \Omega_{r,0}a^{-4} + \Omega_{\Lambda,0} + \Omega_{k,0}a^{-2}\right],$$

$$q = \tfrac{1}{2}(\Omega_m + 2\Omega_r - 2\Omega_\Lambda),$$

where $H$ and $q$ are the Hubble and deceleration parameters respectively, and
$a = R/R_0$ is the normalised scale factor.

15.11 The conformal time variable $\eta$ is defined by $d\eta = c\,dt/R$. Hence show that the
second cosmological field equation can be written as

$$\left(\frac{da}{d\eta}\right)^2 = \frac{H_0^2 R_0^2}{c^2}\left(\Omega_{m,0}a + \Omega_{r,0} + \Omega_{\Lambda,0}a^4 + \Omega_{k,0}a^2\right).$$


15.12 Show that the density parameter for matter, radiation or the vacuum varies with
the normalised scale factor as

$$\Omega_i = \Omega_{i,0}\left(\frac{H_0}{H}\right)^2 a^{-3(1+w_i)},$$

where $w_i$ is the appropriate equation-of-state parameter.

15.13 Show that the condition for the $a(t)$-curve to have a turning point is

$$f(a) \equiv \Omega_{\Lambda,0}a^3 + (1 - \Omega_{m,0} - \Omega_{\Lambda,0})a + \Omega_{m,0} = 0.$$

In the case $\Omega_{\Lambda,0} > 0$, show by evaluating the derivatives $f'(a)$ and $f''(a)$ that the
condition for $f(a)$ to have a single positive root at $a = a_*$ is $f(a_*) = f'(a_*) = 0$.
Show further that this root occurs at

$$a_* = \left(\frac{\Omega_{m,0}}{2\Omega_{\Lambda,0}}\right)^{1/3}.$$

Hence show that the values $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$, along any dividing line in this plane
that separates those models with a turning point in the $a(t)$-curve from those
without, must satisfy

$$4(1 - \Omega_{m,0} - \Omega_{\Lambda,0})^3 + 27\Omega_{m,0}^2\Omega_{\Lambda,0} = 0.$$

15.14 Show that the substitution $x = [\Omega_{\Lambda,0}/(4\Omega_{m,0})]^{1/3}$ reduces the final cubic equation
in Exercise 15.13 to

$$x^3 - \frac{3x}{4} + \frac{\Omega_{m,0} - 1}{4\Omega_{m,0}} = 0.$$

By using the standard formulae for the roots of a cubic, or otherwise, verify the
results (15.18-15.20).

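15.15 Show that, in terms of the variable $\hat t = H_0(t - t_0)$, the evolution of the normalised
scale factor obeys the equation

$$\left(\frac{da}{d\hat t}\right)^2 = \Omega_{m,0}a^{-1} + \Omega_{r,0}a^{-2} + \Omega_{\Lambda,0}a^2 + 1 - \Omega_{m,0} - \Omega_{r,0} - \Omega_{\Lambda,0}.$$

Show that, when one is integrating this equation numerically, an iterative algorithm
of the form

$$a_{n+1} = a_n + \left(\frac{da}{d\hat t}\right)_n \Delta\hat t$$

would not be able to propagate the solution through points for which $da/d\hat t = 0$.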

15.16 For a $k = -1$ Friedmann model containing no matter or radiation, show that the
line element becomes

$$ds^2 = c^2dt^2 - c^2t^2\left[d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\right].$$

Show that this metric describes a Minkowski spacetime.

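15.17 For a dust-only Friedmann model with $\Omega_{m,0} > 1$, show that

$$a = \frac{\Omega_{m,0}}{2(\Omega_{m,0} - 1)}(1 - \cos\psi), \qquad t = \frac{\Omega_{m,0}}{2H_0(\Omega_{m,0} - 1)^{3/2}}(\psi - \sin\psi).$$

Hence show that the $a(t)$-curve has a maximum at

$$a_{\max} = \frac{\Omega_{m,0}}{\Omega_{m,0} - 1}, \qquad t_{\max} = \frac{\pi\,\Omega_{m,0}}{2H_0(\Omega_{m,0} - 1)^{3/2}},$$

and that the age $t_0$ of such a universe is given by

$$t_0 = \frac{\Omega_{m,0}}{2H_0(\Omega_{m,0} - 1)^{3/2}}\left[\cos^{-1}\left(\frac{2}{\Omega_{m,0}} - 1\right) - \frac{2}{\Omega_{m,0}}(\Omega_{m,0} - 1)^{1/2}\right].$$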

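15.18 For the Einstein-de-Sitter model, prove the following useful results:

$$a(t) = \left(\frac{t}{t_0}\right)^{2/3}, \qquad H(t) = \frac{2}{3t} = H_0(1 + z)^{3/2}, \qquad q_0 = \frac{1}{2}, \qquad \rho_m(t) = \frac{1}{6\pi G t^2}.$$

15.19 For a radiation-only Friedmann model with $\Omega_{r,0} \neq 1$, show that

$$a(t) = \left(2H_0\Omega_{r,0}^{1/2}\,t\right)^{1/2}\left(1 + \frac{1 - \Omega_{r,0}}{2\Omega_{r,0}^{1/2}}H_0 t\right)^{1/2}.$$

Hence, for $\Omega_{r,0} > 1$, show that the $a(t)$-curve has a maximum at

$$a_{\max} = \left(\frac{\Omega_{r,0}}{\Omega_{r,0} - 1}\right)^{1/2}, \qquad t_{\max} = \frac{\Omega_{r,0}^{1/2}}{H_0(\Omega_{r,0} - 1)},$$

and that the age $t_0$ of such a universe is given by

$$t_0 = \frac{1}{H_0}\,\frac{1}{\Omega_{r,0}^{1/2} + 1} < \frac{1}{2H_0}.$$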

15.20 For the spatially flat, radiation-only, Friedmann model, prove the following useful
results:

$$a(t) = \left(\frac{t}{t_0}\right)^{1/2}, \qquad H(t) = \frac{1}{2t} = H_0(1 + z)^2, \qquad q_0 = 1, \qquad \rho_r(t) = \frac{3}{32\pi G t^2}.$$

15.21 For a spatially flat Friedmann model containing both matter and radiation, show
that

$$H_0 t = \frac{2}{3\Omega_{m,0}^2}\left[(\Omega_{m,0}a + \Omega_{r,0})^{1/2}(\Omega_{m,0}a - 2\Omega_{r,0}) + 2\Omega_{r,0}^{3/2}\right].$$

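15.22 For a Lemaître model containing no radiation, show that at the point of inflection
of the $a(t)$-curve the value of the normalised scale factor is

$$a_* = \left(\frac{\Omega_{m,0}}{2\Omega_{\Lambda,0}}\right)^{1/3},$$

and calculate $a_*$ for our universe. Show further that, in the vicinity of the point of
inflection, the scale factor obeys the equation

$$\dot a^2 \approx H_0^2\left[\Omega_{k,0} + 3\Omega_{\Lambda,0}a_*^2 + 3\Omega_{\Lambda,0}(a - a_*)^2\right]$$

and that this has the solution

$$a(t) = a_* + a_*\left[1 + \tfrac{1}{3}\Omega_{k,0}\left(\tfrac{1}{4}\Omega_{\Lambda,0}\Omega_{m,0}^2\right)^{-1/3}\right]^{1/2}\sinh\left[H_0(3\Omega_{\Lambda,0})^{1/2}(t - t_*)\right].$$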

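15.23 For a spatially flat Lemaître model containing no radiation, show that

$$H_0 t = \frac{2}{3\sqrt{|\Omega_{\Lambda,0}|}}\int_0^{\sqrt{a^3|\Omega_{\Lambda,0}|/(1 - \Omega_{\Lambda,0})}} \frac{dy}{\sqrt{1 \pm y^2}},$$

where the upper sign applies for $\Omega_{\Lambda,0} > 0$ and the lower sign for $\Omega_{\Lambda,0} < 0$.
Hence show that

$$a = \left(\frac{1 - \Omega_{\Lambda,0}}{|\Omega_{\Lambda,0}|}\right)^{1/3} \times
\begin{cases}
\sinh^{2/3}\left(\tfrac{3}{2}\sqrt{|\Omega_{\Lambda,0}|}\,H_0 t\right) & \text{if } \Omega_{\Lambda,0} > 0, \\
\sin^{2/3}\left(\tfrac{3}{2}\sqrt{|\Omega_{\Lambda,0}|}\,H_0 t\right) & \text{if } \Omega_{\Lambda,0} < 0.
\end{cases}$$

15.24 Show that, in general,

$$\frac{\ddot R}{R} = H^2 + \dot H.$$

Hence use the cosmological field equations to show that, for a spatially flat
Lemaître model containing no radiation, the Hubble parameter and the matter
density satisfy the equations

$$2\dot H + 3H^2 = \Lambda c^2,$$

$$3H^2 - \Lambda c^2 = 8\pi G\rho_m.$$

Assuming $\Lambda > 0$ and requiring $\rho_m > 0$, thus show that

$$H(t) = \sqrt{\frac{\Lambda c^2}{3}}\,\coth\left(\frac{3}{2}\sqrt{\frac{\Lambda c^2}{3}}\,t\right),$$

$$\rho_m(t) = \frac{\Lambda c^2}{8\pi G}\,\mathrm{cosech}^2\left(\frac{3}{2}\sqrt{\frac{\Lambda c^2}{3}}\,t\right),$$

and therefore find expressions for $\Omega_m(t)$ and $\Omega_\Lambda(t)$. Show further that

$$t = \frac{2}{3H}\,\frac{\tanh^{-1}\sqrt{\Omega_\Lambda}}{\sqrt{\Omega_\Lambda}}.$$

Hint: $\int \frac{a}{a^2 - x^2}\,dx = \coth^{-1}(x/a) + \text{constant}$, for $x^2 > a^2$.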

15.25 Show that for a physically reasonable perfect fluid (i.e. density > 0 and pressure
> 0) there is no static isotropic homogeneous solution to Einstein’s equations with
$\Lambda = 0$. Show that it is possible to obtain a static zero-pressure solution by the
introduction of a cosmological constant $\Lambda$ such that

$$\Lambda c^2 = 4\pi G\rho_{m,0} = \frac{c^2 k}{R_0^2}.$$

Show that this solution is unstable, however.

15.26 Show that the comoving $\chi$-coordinate of a galaxy emitting a photon at time $t$ that
is received at $t_0$ is given by

$$\chi = \frac{c}{R_0}\int_{(1+z)^{-1}}^{1} \frac{da}{a\dot a}.$$

Using the cosmological field equation (15.13) to substitute for $\dot a$, show that

$$\chi(z) = \frac{c}{R_0 H_0}\int_{(1+z)^{-1}}^{1} \frac{dx}{\sqrt{\Omega_{m,0}x + \Omega_{r,0} + \Omega_{\Lambda,0}x^4 + \Omega_{k,0}x^2}}.$$


15.27 For a dust-only Friedmann model, show that the luminosity-distance relation varies
with redshift as

$$d_L(z) = \frac{2c}{H_0\Omega_{m,0}^2}\left[\Omega_{m,0}z + (\Omega_{m,0} - 2)\left(\sqrt{\Omega_{m,0}z + 1} - 1\right)\right].$$

This result is known as Mattig’s formula.

15.28 For a Friedmann model dominated by a single source of energy density, show that

$$\Omega_i^{-1}(z) - 1 = \frac{\Omega_{i,0}^{-1} - 1}{(1 + z)^{1 + 3w_i}},$$

where $w_i$ is the equation-of-state parameter of the source. Use this result to
comment on the flatness problem.

15.29 For a general cosmological model, show that

$$\dot\Omega_i = \Omega_i H(\Omega_m + 2\Omega_r - 2\Omega_\Lambda - 1 - 3w_i),$$

where $i$ denotes matter, radiation or the vacuum.

15.30 By differentiating the definition $\Omega_k = -kc^2/(R^2H^2)$, show that

$$\dot\Omega_k = 2\Omega_k H q = \Omega_k H(\Omega_m + 2\Omega_r - 2\Omega_\Lambda),$$

where $q$ is the deceleration parameter.

15.31 Show that the particle horizon at cosmic time $t$ is given by

$$\chi_p(t) = \frac{c}{R_0 H_0}\int_0^{a(t)} \frac{dx}{\sqrt{\Omega_{m,0}x + \Omega_{r,0} + \Omega_{\Lambda,0}x^4 + \Omega_{k,0}x^2}}.$$

15.32 Consider the cosmological line element

$$ds^2 = c^2dt^2 - e^{2t/b}\left(dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\phi^2\right).$$

Light signals from a galaxy at coordinate distance $r$ are emitted at epoch $t_1$ and
received by an observer at epoch $t_0$. Show that

$$\frac{r}{bc} = e^{-t_1/b} - e^{-t_0/b}.$$

For a given $r$, show that there is a maximum epoch $t_1$ and interpret this result
physically. Show that a light ray emitted by the observer asymptotically approaches
the coordinate $r = bc$ but never reaches it.

In the last two sections of the previous chapter, we saw that standard cosmological
models suffer, in particular, from the flatness problem and the horizon problem.
To these problems, one might also add the ‘expansion problem’, which asks
simply why the universe is expanding at all. Although this appears as an initial
condition in cosmological models, one would hope to explain this phenomenon
with an underlying physical mechanism. In this chapter, we therefore augment
our discussion of cosmological models with a brief outline of the inflationary
scenario, which seeks to solve these problems (and others) and has, over the
past two decades, become a fundamental part of modern cosmological theory. 1 In
particular, we will discuss the effect of inflation on the evolution of the universe
as a whole and also consider how inflation gives rise to perturbations in the
early universe that subsequently collapse under gravity to form all the structure
we observe in the universe today. Given the general algebraic complexity of
these topics (particularly the perturbation analysis), we will adopt the convention
throughout this chapter that

$$8\pi G = c = 1.$$

This choice of units makes many of the equations far less cluttered and amounts
only to a rescaling of the scalar field and its potential (see below), which we can
remove at the end if desired.


1 For a detailed discussion of inflationary cosmology, see, for example, A. Liddle & D. Lyth, Cosmological
Inflation and Large-scale Structure, Cambridge University Press, 2000.


16.1 Definition of inflation

As noted in Section 15.12, the horizon problem is a direct consequence of the
deceleration in the expansion of the universe. Thus, a possible solution is to
postulate an accelerating phase of expansion, prior to any decelerating phase. In
an accelerating phase, causal contact is better at earlier times and so remotely
separated parts of our present universe could have ‘coordinated’ their physical
characteristics in the early universe. Such an accelerating phase is called a period
of inflation. Hence the basic definition of inflation is that

$$\ddot R > 0. \qquad (16.1)$$

In fact, we may recast this condition in an alternative manner that is physically
more meaningful by considering the comoving Hubble distance defined in (15.51),
namely $\chi_H(t) = H^{-1}(t)/R(t)$. The derivative with respect to $t$ is given by

$$\frac{d}{dt}\left(\frac{H^{-1}}{R}\right) = \frac{d}{dt}\left(\frac{1}{\dot R}\right) = -\frac{\ddot R}{\dot R^2},$$

and so the condition (16.1) can be written as

$$\frac{d}{dt}\left(\frac{H^{-1}}{R}\right) < 0.$$

Thus, an equivalent condition for inflation is that the comoving Hubble distance
decreases with cosmic time. Hence, when viewed in comoving coordinates, the
characteristic length scale of the universe becomes smaller as inflation proceeds.

Let us suppose that, at some period in the early universe, the energy density
is dominated by some form of matter with density $\rho$ and pressure $p$. The first
cosmological field equation (14.36) (with $\Lambda = 0$ and $8\pi G = c = 1$) then reads

$$\ddot R = -\tfrac{1}{6}(\rho + 3p)R. \qquad (16.2)$$

Thus, we see that in order for the universe to accelerate, i.e. for inflation to occur,
we require that

$$p < -\tfrac{1}{3}\rho. \qquad (16.3)$$

In other words, we need the ‘matter’ to have an equation of state with negative
pressure. In fact, the above criterion can also solve the flatness problem. The
second cosmological field equation (with $\Lambda = 0$ and $8\pi G = c = 1$) reads

$$\dot R^2 = \tfrac{1}{3}\rho R^2 - k. \qquad (16.4)$$

During a period of acceleration ($\ddot R > 0$), the scale factor must increase faster than
$R(t) \propto t$. Provided that $p < -\tfrac{1}{3}\rho$, the quantity $\rho R^2$ will increase during such a
period as the universe expands and can make the curvature term on the right-hand
side negligible, provided that the accelerating phase persists for sufficiently
long.
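The growth of $\rho R^2$ can be made explicit in the simple case of a constant equation-of-state parameter $w = p/\rho$ (an assumption made here purely for illustration, in units with $c = 1$). The equation of motion (14.39) for the fluid then integrates directly:

```latex
\dot\rho = -3(1 + w)\frac{\dot R}{R}\,\rho
\quad\Longrightarrow\quad
\rho \propto R^{-3(1+w)}
\quad\Longrightarrow\quad
\rho R^2 \propto R^{-(1+3w)}.
```

Thus $\rho R^2$ increases with $R$ precisely when $w < -\tfrac{1}{3}$, so that the ratio of the curvature term $k$ to $\tfrac{1}{3}\rho R^2$ in (16.4) is driven towards zero as the expansion proceeds.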

We could, of course, have included the cosmological-constant terms in the 
two field equations, which would then be equivalent to those for a fluid with 
an equation of state $p = -\rho$ and so would clearly satisfy the criterion (16.3).
However, we have chosen to omit such terms since, as we will see, if ‘matter’ 
in the form of a scalar field exists in the early universe then this can act as an 
effective cosmological constant. In order to show that the existence of such fields 
is likely, we must consider briefly the topic of phase transitions in the very early 
universe. 


16.2 Scalar fields and phase transitions in the very early universe 

The basic physical mechanism for producing a period of inflation in the very 
early universe relies on the existence, at such epochs, of matter in a form that can 
be described classically in terms of a scalar field (as opposed to a vector, tensor 
or spinor field, examples of which are provided by the electromagnetic field, the
gravitational field and normal baryonic matter respectively). Upon quantisation, 
a scalar field describes a collection of spinless particles. 

It may at first seem rather arbitrary to postulate the presence of such scalar 
fields in the very early universe. Nevertheless, their existence is suggested by our 
best theories for the fundamental interactions in Nature, which predict that the 
universe experienced a succession of phase transitions in its early stages as it 
expanded and cooled. For the purposes of illustration, let us model this expansion 
by assuming that the universe followed a standard radiation-dominated Friedmann 
model in its early stages, in which case 

$$R(t) \propto t^{1/2} \propto T^{-1}(t), \qquad (16.5)$$

where the ‘temperature’ $T$ is related to the typical particle energy by $T \sim E/k_B$.
The basic scenario is as follows. 

• Zip ~ 10 19 GeV > E > E gut ~ 10 15 GeV The earliest point at which the universe can 
be modelled (even approximately) as a classical system is the Planck era, corresponding 
to particle energies E P ~ 10 19 GeV (or temperature 7 P ~ 10 32 K) and time scales t P ~ 
1CT 43 s (prior to this epoch, it is considered that the universe can be described only 
in terms of some, as yet unknown, quantum theory of gravity). At these extremely 
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high energies, grand unified theories (GUTs) predict that the electroweak and strong 
forces are in fact unified into a single force and that these interactions bring the 
particles present into thermal equilibrium. Once the universe has cooled to £ GUT ~ 
10 14 GeV (corresponding to 7c,UT ~ 10 27 K), there is a spontaneous breaking of the 
larger symmetry group characterising the GUT into a product of smaller symmetry 
groups, and the electroweak and strong forces separate. From (16.5), this GUT phase 
transition occurs at r GUT ~ 10 -36 s. 

• $E_{\rm GUT} \sim 10^{15}$ GeV $> E > E_{\rm EW} \sim 100$ GeV. During this period (which is extremely
long in logarithmic terms), the electroweak and strong forces are separate and these
interactions sustain thermal equilibrium. This continues until the universe has cooled
to $E_{\rm EW} \sim 100$ GeV (corresponding to $T_{\rm EW} \sim 10^{15}$ K), when the unified electroweak
theory predicts that a second phase transition should occur in which the electromagnetic
and weak forces separate. From (16.5), this electroweak phase transition occurs at
$t_{\rm EW} \sim 10^{-11}$ s.

• $E_{\rm EW} \sim 100$ GeV $> E > E_{\rm QH} \sim 100$ MeV. During this period the electromagnetic, weak
and strong forces are separate, as they are today. It is worth noting, however, that
when the universe has cooled to $E_{\rm QH} \sim 100$ MeV (corresponding to $T_{\rm QH} \sim 10^{12}$ K)
there is a final phase transition, according to the theory of quantum chromodynamics,
in which the strong force increases in strength and leads to the confinement
of quarks into hadrons. From (16.5), this quark-hadron phase transition occurs
at $t_{\rm QH} \sim 10^{-5}$ s.

In general, phase transitions occur via a process called spontaneous symmetry 
breaking, which can be characterised by the acquisition of certain non-zero values 
by scalar parameters known as Higgs fields. The symmetry is manifest when 
the Higgs fields have the value zero; it is spontaneously broken whenever at
least one of the Higgs fields becomes non-zero. Thus, the occurrence of phase 
transitions in the very early universe suggests the existence of scalar fields and 
hence provides the motivation for considering their effect on the expansion of the 
universe. In the context of inflation, we will confine our attention to scalar fields 
present at, or before, the GUT phase transition (the most speculative of these phase 
transitions). 


16.3 A scalar field as a cosmological fluid 

For simplicity, let us consider a single scalar field $\phi$ present in the very early
universe. The field $\phi$ is traditionally called the ‘inflaton’ field for reasons that will
become apparent shortly. The Lagrangian for a scalar field $\phi$ (see Section 19.6)
has the usual form of a kinetic term minus a potential term:

$$L = \tfrac{1}{2} g^{\mu\nu}(\partial_\mu\phi)(\partial_\nu\phi) - V(\phi).$$

The corresponding field equation for $\phi$ is obtained from the Euler-Lagrange
equations and reads

$$\Box^2\phi + \frac{dV}{d\phi} = 0, \tag{16.6}$$

where $\Box^2 \equiv \nabla_\mu\nabla^\mu = g^{\mu\nu}\nabla_\mu\nabla_\nu$ is the covariant d’Alembertian operator. A simple
example is a free relativistic scalar field of mass $m$, for which the potential would
be $V(\phi) = \tfrac{1}{2}m^2\phi^2$ and the field equation becomes the covariant Klein-Gordon
equation,

$$\Box^2\phi + m^2\phi = 0.$$


For the moment, however, it is best to keep the potential function $V(\phi)$ general.

The energy-momentum tensor $T_{\mu\nu}$ for a scalar field can be derived from
this variational approach (see Section 19.12), but in fact we can use our earlier
experience to anticipate its form. By analogy with the forms of the energy-momentum
tensor for dust and for electromagnetic radiation, we require that $T_{\mu\nu}$
is (i) symmetric and (ii) quadratic in the derivatives of the dynamical variable $\phi$,
and (iii) that $\nabla_\mu T^{\mu\nu} = 0$ by virtue of the field equation (16.6). It is straightforward
to show that the required form must be

$$T_{\mu\nu} = (\partial_\mu\phi)(\partial_\nu\phi) - g_{\mu\nu}\left[\tfrac{1}{2}(\partial_\sigma\phi)(\partial^\sigma\phi) - V(\phi)\right]. \tag{16.7}$$

The energy-momentum tensor for a perfect fluid is

$$T_{\mu\nu} = (\rho + p)\,u_\mu u_\nu - p\,g_{\mu\nu},$$

and by comparing the two forms in a Cartesian inertial coordinate system ($g_{\mu\nu} = \eta_{\mu\nu}$)
in which the fluid is at rest, we see that the scalar field acts like a perfect
fluid, with an energy density and pressure given by

$$\rho_\phi = \tfrac{1}{2}\dot\phi^2 + V(\phi) + \tfrac{1}{2}(\nabla\phi)^2,$$
$$p_\phi = \tfrac{1}{2}\dot\phi^2 - V(\phi) - \tfrac{1}{6}(\nabla\phi)^2. \tag{16.8}$$

In particular, we note that if the field $\phi$ were both temporally and spatially
constant, its equation of state would be $p_\phi = -\rho_\phi$ and so the scalar field would
act as a cosmological constant with $\Lambda = V(\phi)$ (with $8\pi G = c = 1$). In general this
is not the case, but we will assume that the spatial derivatives can be neglected.
This is equivalent to assuming that $\phi$ is a function only of $t$ and so has no spatial
variation.
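
As a rough numerical illustration of the fluid analogy, the following Python sketch evaluates the effective energy density and pressure of a scalar field in units with $8\pi G = c = 1$. The function name `scalar_field_fluid` and its arguments are illustrative choices, not notation from the text, and the sample values are arbitrary.

```python
def scalar_field_fluid(phi_dot, potential, grad_phi_sq=0.0):
    """Return (rho, p) for a scalar field treated as a perfect fluid.

    phi_dot      -- time derivative of the field
    potential    -- value of V(phi) at the current field value
    grad_phi_sq  -- squared spatial gradient (grad phi)^2, zero if homogeneous
    """
    rho = 0.5 * phi_dot**2 + potential + 0.5 * grad_phi_sq
    p = 0.5 * phi_dot**2 - potential - grad_phi_sq / 6.0
    return rho, p


# A static, homogeneous field mimics a cosmological constant: p = -rho.
rho_static, p_static = scalar_field_fluid(phi_dot=0.0, potential=2.0)
w_static = p_static / rho_static

# A kinetic-dominated field (negligible potential) is 'stiff': p = +rho.
rho_kinetic, p_kinetic = scalar_field_fluid(phi_dot=1.0, potential=0.0)
w_kinetic = p_kinetic / rho_kinetic
```

The ratio $w = p/\rho$ therefore ranges between $-1$ and $+1$ for a homogeneous field, depending on how the energy is shared between the kinetic and potential terms.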
Let us suppose that the scalar field does not interact (except gravitationally) with
any other matter or radiation that may be present. In this case, the scalar field
will independently obey an equation of motion of the form (14.39), namely

$$\dot\rho + 3(\rho + p)\frac{\dot R}{R} = 0.$$

Substituting the expressions (16.8) and assuming no spatial variations, we quickly
find that the equation of motion of the scalar field is

$$\ddot\phi + 3H\dot\phi + \frac{dV}{d\phi} = 0. \tag{16.9}$$

The form of this equation will be familiar to any student of classical mechanics
and allows one to develop an intuitive picture of the evolution of the scalar field.
If one thinks of the plot of the potential $V$ versus $\phi$ as defining some curve, then
the motion of the scalar field value $\phi$ is identical to that of a ball rolling (or,
more precisely, sliding) under gravity along the curve, subject to a frictional force
proportional to its speed (and to the value of the Hubble parameter).

Let us assume further that there is some period when the scalar field dominates
the energy density of the universe. Moreover we will demand that the scalar field
energy density is sufficiently large that we may neglect the curvature term in the
cosmological field equation (16.4), although this is not strictly necessary.² Thus,
we may write (16.4) as

$$H^2 = \tfrac{1}{3}\left[\tfrac{1}{2}\dot\phi^2 + V(\phi)\right]. \tag{16.10}$$

This equation and (16.9) thus provide a set of coupled differential equations in
$\phi$ and $H$ that determine completely the evolution of the scalar field and the
scale factor of the universe during the epoch of scalar-field domination. From our
criterion (16.3) and the expressions (16.8), we see that inflation will occur (i.e.
$\ddot R > 0$) provided that

$$\dot\phi^2 < V(\phi). \tag{16.11}$$





² Note that, even if the curvature term is not negligible to begin with, the initial stages of inflation will soon
render it so.
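
The coupled system formed by the equation of motion $\ddot\phi + 3H\dot\phi + dV/d\phi = 0$ and the Friedmann equation $H^2 = \tfrac{1}{3}[\tfrac{1}{2}\dot\phi^2 + V(\phi)]$ is straightforward to integrate numerically. The sketch below is one possible way of doing so, under assumptions that are not taken from the text: a quadratic potential $V = \tfrac{1}{2}m^2\phi^2$ with $m = 1$, a field starting from rest at $\phi = 16$ (in units with $8\pi G = c = 1$), and a fixed-step fourth-order Runge-Kutta scheme. All function names are hypothetical.

```python
import math


def potential(phi, m=1.0):
    """Quadratic potential V = m^2 phi^2 / 2 (an illustrative choice)."""
    return 0.5 * m**2 * phi**2


def dpotential(phi, m=1.0):
    """Derivative dV/dphi for the quadratic potential."""
    return m**2 * phi


def hubble(phi, phi_dot):
    """Friedmann equation with curvature neglected: H^2 = (phi_dot^2/2 + V)/3."""
    return math.sqrt((0.5 * phi_dot**2 + potential(phi)) / 3.0)


def rhs(state):
    """Time derivatives of the state (phi, phi_dot, ln R)."""
    phi, phi_dot, _ = state
    h = hubble(phi, phi_dot)
    return (phi_dot, -3.0 * h * phi_dot - dpotential(phi), h)


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def efolds_until_inflation_ends(phi_start, dt=1e-3, max_steps=1_000_000):
    """Integrate from rest until phi_dot^2 >= V; return ln(R_end / R_start)."""
    state = (phi_start, 0.0, 0.0)
    for _ in range(max_steps):
        phi, phi_dot, _ = state
        if phi_dot**2 >= potential(phi):  # inflation condition violated
            break
        state = rk4_step(state, dt)
    return state[2]


n_efolds = efolds_until_inflation_ends(phi_start=16.0)
```

The friction term $3H\dot\phi$ quickly brings the field onto a slowly rolling trajectory, and the accumulated value of $\ln R$ gives the number of e-foldings achieved before the condition $\dot\phi^2 < V(\phi)$ fails.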

16.5 The slow-roll approximation

The inflation equations (16.9) and (16.10) can easily be solved numerically, and
even analytically for some special choices of $V(\phi)$. In general, however, an
analytical solution is only possible in the slow-roll approximation, in which it is
assumed that $\dot\phi^2 \ll V(\phi)$. On differentiating, this in turn implies that $\ddot\phi \ll dV/d\phi$
and so the $\ddot\phi$-term can be neglected in the equation of motion (16.9), to yield

$$3H\dot\phi = -\frac{dV}{d\phi}. \tag{16.12}$$

Moreover, the cosmological field equation (16.10) becomes simply

$$H^2 = \tfrac{1}{3}V(\phi). \tag{16.13}$$


It is worth noting that, in this approximation, the rate of change of the Hubble
parameter and the scalar field can be related very easily. Differentiating (16.13)
with respect to $t$ and combining the result with (16.12), one obtains

$$\dot H = -\tfrac{1}{2}\dot\phi^2. \tag{16.14}$$


The conditions for inflation in the slow-roll approximation can be put into
a useful dimensionless form. Using the two equations above and the condition
$\dot\phi^2 \ll V(\phi)$, it is easy to show that

$$\epsilon \equiv \frac{1}{2}\left(\frac{V'}{V}\right)^2 \ll 1, \tag{16.15}$$

where $V' \equiv dV/d\phi$ and the factor $\tfrac{1}{2}$ is included according to the standard convention.
Differentiating the above expression with respect to $\phi$, one also finds that

$$\eta \equiv \frac{V''}{V} \ll 1. \tag{16.16}$$

These two conditions make good physical sense in that they require the potential
$V(\phi)$ to be sufficiently ‘flat’ that the field $\phi$ ‘rolls’ slowly enough for inflation to
occur. It is worth noting, however, that these conditions alone are necessary but
not sufficient conditions for inflation, since they limit only the form of $V(\phi)$ and
not that of $\dot\phi$, which could be chosen to violate the condition (16.11). Thus, one
must also assume that (16.11) holds.
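
To make the flatness conditions concrete, the short Python sketch below evaluates the two dimensionless slow-roll quantities $\tfrac{1}{2}(V'/V)^2$ and $V''/V$ for a quadratic potential $V = \tfrac{1}{2}m^2\phi^2$, in units with $8\pi G = c = 1$. The function `slow_roll_parameters`, the choice of potential and the sample field values are assumptions made purely for illustration.

```python
def slow_roll_parameters(V, dV, d2V, phi):
    """Return (epsilon, eta) = (0.5 * (V'/V)^2, V''/V) at field value phi."""
    v = V(phi)
    epsilon = 0.5 * (dV(phi) / v) ** 2
    eta = d2V(phi) / v
    return epsilon, eta


m = 1.0  # hypothetical mass scale; it cancels in both ratios


def V(phi):
    """Quadratic potential V = m^2 phi^2 / 2."""
    return 0.5 * m**2 * phi**2


def dV(phi):
    """First derivative dV/dphi."""
    return m**2 * phi


def d2V(phi):
    """Second derivative d^2V/dphi^2 (constant for a quadratic potential)."""
    return m**2


# Far from the minimum both parameters are small, so slow roll is possible.
eps_far, eta_far = slow_roll_parameters(V, dV, d2V, phi=10.0)

# Close to the minimum both parameters exceed unity and slow roll fails.
eps_near, eta_near = slow_roll_parameters(V, dV, d2V, phi=1.0)
```

For this potential both quantities equal $2/\phi^2$, so the slow-roll conditions hold only for field values well above unity in these units.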

It is worth considering the special case in which the potential $V(\phi)$ is sufficiently
flat that, during (some part of) the period of inflation, its value remains roughly
constant. From (16.13), we see that in this case the Hubble parameter is constant
and the scale factor grows exponentially.



16.6 Ending inflation 

As the field value $\phi$ ‘rolls’ down the potential $V(\phi)$, the condition (16.11) will
eventually no longer hold and inflation will cease. Equivalently, in the slow-roll
approximation, the conditions (16.15, 16.16) will eventually no longer be
satisfied. If the potential $V(\phi)$ possesses a local minimum, which is usually the
case in most inflationary models, the field will no longer roll slowly downhill but
will oscillate about the minimum of the potential, the oscillation being gradually
damped by the $3H\dot\phi$ friction term in the equation of motion (16.9). Eventually,
the scalar field is left stationary at the bottom of the potential. If the value of
the potential at its minimum is $V_{\rm min} > 0$ then clearly the condition (16.11) is
again satisfied and the universe continues to inflate indefinitely. Moreover, in this
case $p_\phi = -\rho_\phi$ and so the scalar field acts as an effective cosmological constant
$\Lambda = V_{\rm min}$. If $V_{\rm min} = 0$, however, no further inflation occurs, the scalar field has
zero energy density and the dynamics of the universe is dominated by any other
fields present.

In fact, the scenario outlined above would occur only if the scalar field were 
not coupled to any other fields, which is almost certainly not the case. In practice, 
such couplings will cause the scalar field to decay during the oscillatory phase 
into pairs of elementary particles, into which the energy of the scalar field is thus 
converted. The universe will therefore contain roughly the same energy density 
as it did at the start of inflation. The process of decay of the scalar field into 
other particles is therefore termed reheating. These particles will interact with 
each other and subsequently decay themselves, leaving the universe filled with 
normal matter and radiation in thermal equilibrium and thereby providing the 
initial conditions for a standard cosmological model. 


16.7 The amount of inflation 

Although the motivation for the introduction of the inflationary scenario was (in 
part) to solve the flatness and horizon problems, we have not yet considered the 
amount of inflation required to achieve this goal. From our present understanding 
of particle physics, it is thought that inflation occurs at around the era of the GUT
phase transition, or earlier. For illustration, let us assume that the universe has
followed a standard radiation-dominated Friedmann model for (the majority of)
its history since the epoch of inflation at $t \sim t_*$. From (16.5), we thus have

$$\frac{R_*}{R_0} \sim \left(\frac{t_*}{t_0}\right)^{1/2} \sim \frac{T_0}{T_*}, \tag{16.17}$$

where $T_0 \sim 3$ K is the present-day temperature of the cosmic microwave background
radiation and $t_0 \sim 1/H_0 \sim 10^{18}$ s is the present age of the universe.

Let us first consider the flatness problem. From (15.47), the ratio of the spatial
curvature density at the inflationary epoch to that at the present epoch is given by

$$\frac{\Omega_{k,*}}{\Omega_{k,0}} = \left(\frac{H_0}{H_*}\right)^2\left(\frac{R_0}{R_*}\right)^2 \sim \frac{t_*}{t_0}, \tag{16.18}$$

where we have used the fact that $H_0/H_* \sim t_*/t_0$. Assuming inflation to occur
at some time between the Planck era and the GUT phase transition, so that
$t_P < t_* < t_{\rm GUT}$, from Section 16.2 we find that the ratio (16.18) lies in the range
$\sim 10^{-60}$-$10^{-54}$. Thus, if the present-day value $\Omega_{k,0}$ is of order unity then the
required degree of fine-tuning of $\Omega_{k,*}$ is extreme, in a standard cosmological
model. Since the ratio above depends on $1/R_*^2$ we thus find that, to solve the
flatness problem (in order that $\Omega_{k,*}$ can also be of order unity), we require the
scale factor to grow during inflation by a factor $\sim 10^{27}$-$10^{30}$. In terms of the
required number $N$ of e-foldings of the scale factor, we thus have

$N \gtrsim 60$-$70$ (flatness problem).


We now turn to the horizon problem. If the universe followed a standard
radiation-dominated Friedmann model in its earliest stages, then (reinstating $c$ for
the moment) the particle horizon at the inflationary epoch is

$$d_{p*} = 2ct_*,$$

which, taking $t_P < t_* < t_{\rm GUT}$, gives the size of a causally connected region at this
time as $\sim 10^{-33}$-$10^{-27}$ m. From (16.17), we see that the size of such a region
today would be only $\sim 10^{-3}$-$1$ m. The current size of the observable universe,
however, is given approximately by the present-day Hubble distance,

$$d_{H0} = cH_0^{-1} \sim 10^{26}\ {\rm m} \sim 3000\ {\rm Mpc}.$$

To solve the horizon problem, we thus require the scale factor to grow by a factor
of $\sim 10^{26}$-$10^{29}$ during the period of inflation. Expressing this result in terms of
the required number $N$ of e-foldings, we once again find

$N \gtrsim 60$-$70$ (horizon problem).
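
The conversion from a growth factor of the scale factor to a number of e-foldings is simply a natural logarithm, $N = \ln(R_{\rm end}/R_{\rm start})$. As a quick check of the figures quoted for the two problems, one might evaluate this in Python; the function name `efoldings` is an illustrative choice.

```python
import math


def efoldings(growth_factor):
    """Number of e-foldings N such that R_end / R_start = exp(N)."""
    return math.log(growth_factor)


# Growth factors quoted in the text for the two problems.
flatness_range = (efoldings(1e27), efoldings(1e30))
horizon_range = (efoldings(1e26), efoldings(1e29))
```

The resulting ranges, roughly 62 to 69 for the flatness problem and 60 to 67 for the horizon problem, are consistent with the requirement that $N$ be at least around 60-70.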


We have thus found that both the flatness and the horizon problems can be 
solved by a period of inflation, provided that the scale factor undergoes more than 
around 60-70 e-foldings during this period. We may now consider the constraints 
placed by this condition on the form of the scalar field potential $V(\phi)$. In the
slow-roll approximation, the number of e-foldings that occur while the scalar field
‘rolls’ from $\phi_1$ to $\phi_2$ is given by

$$N = \int_{t_1}^{t_2} H\,dt = \int_{\phi_1}^{\phi_2}\frac{H}{\dot\phi}\,d\phi = -\int_{\phi_1}^{\phi_2}\frac{V}{V'}\,d\phi.$$

If the potential is reasonably smooth then $V' \sim V/\phi$. Thus, if $\Delta\phi = |\phi_{\rm start} - \phi_{\rm end}|$
is the range of $\phi$-values over which inflation occurs, one finds $N \sim (\Delta\phi)^2$. In
order to solve the flatness and horizon problems, one hence requires $\Delta\phi \gg 1$.
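
As a worked illustration of this estimate, consider the quadratic potential $V = \tfrac{1}{2}m^2\phi^2$ (an assumed example, in units with $8\pi G = c = 1$), for which the slow-roll integral $N = -\int (V/V')\,d\phi$ can be evaluated in closed form. Taking inflation to end when $\tfrac{1}{2}(V'/V)^2 = 1$ is a conventional choice adopted here for definiteness, not a statement from the text.

```latex
\begin{align*}
\frac{V}{V'} &= \frac{\tfrac{1}{2}m^2\phi^2}{m^2\phi} = \frac{\phi}{2}, \\
N &= -\int_{\phi_{\rm start}}^{\phi_{\rm end}} \frac{\phi}{2}\,d\phi
   = \frac{\phi_{\rm start}^2 - \phi_{\rm end}^2}{4}, \\
\tfrac{1}{2}\left(\frac{V'}{V}\right)^2 &= \frac{2}{\phi^2} = 1
   \quad\Longrightarrow\quad \phi_{\rm end} = \sqrt{2}, \\
N &\approx \frac{\phi_{\rm start}^2}{4} - \frac{1}{2} \gtrsim 60
   \quad\Longrightarrow\quad \phi_{\rm start} \gtrsim 16 .
\end{align*}
```

The mass $m$ drops out of the result, and the required field excursion is indeed much larger than unity, in line with the scaling $N \sim (\Delta\phi)^2$.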


16.8 Starting inflation 

The observant reader will have noticed that so far we have not discussed how 
inflation may start. During the inflationary epoch, the scalar field rolls downhill 
from $\phi_{\rm start}$ to $\phi_{\rm end}$, but we have not yet considered how the universe can arrive at
an appropriate starting state. The details will depend, in fact, on the precise inflationary
cosmology under consideration, but there are generally two main classes
of model. In early models of inflation, the inflationary epoch is an ‘interlude’ in
the evolution of a standard cosmological model. In such models, the inflaton field
$\phi$ is usually identified with a scalar Higgs field operating during the GUT phase
transition. It is thus assumed that the universe was in a state of thermal equilibrium
from the very beginning and that this state was relatively homogeneous
and large enough to survive until the beginning of inflation at the GUT era; an
example of this sort is provided by the ‘new’ inflation model discussed below in
Section 16.9. In more recent models of inflation, the scalar field $\phi$ is not identified
with the Higgs field in the GUT phase transition but is some generic scalar field 
present in the very early universe. In particular, in these models the universe may 
inflate soon after it exits the Planck era, thereby avoiding the above assumptions 
regarding the state of the universe prior to the inflationary epoch; an example of 
such a model is the chaotic inflation scenario discussed in Section 16.10. We will
also discuss briefly the natural extension of the chaotic inflation model, called
stochastic inflation (or eternal inflation) in Section 16.11.


16.9 ‘New’ inflation 

In the ‘new’ inflation model,³ the inflationary epoch occurs when the universe
goes through the GUT phase transition. As we will see, models of this general type
typically require a rather special form for the potential $V(\phi)$ in order to produce
an effective period of inflation. In particular, identifying the inflaton field $\phi$ with
the scalar Higgs field operating during the GUT spontaneous-symmetry-breaking
phase transition, considerations from quantum field theory suggest a form for the
potential $V(\phi, T)$ which is actually also a function of temperature $T$. The typical
form for $V(\phi, T)$ is shown in Figure 16.1 for several values of $T$. At very high
temperatures the potential is parabolic with a minimum at $\phi = 0$, which is the
true vacuum state (i.e. the state of lowest energy). Thus at very high temperatures
we would expect the scalar field to have the value $\phi = 0$. However, for lower
temperatures the form of the potential changes until at the critical temperature
$T = T_c$ the potential develops a lower energy state than that at $\phi = 0$. Thus this
new non-zero value of $\phi$ is now the true vacuum state, and $\phi = 0$ is now a false
vacuum state. For even lower temperatures the new true vacuum state becomes more
pronounced until a final form is reached for ‘low’ temperatures.

Figure 16.1 The temperature-dependent potential function for a Higgs-like
scalar field $\phi$.

³ The 'new' inflationary model is so called in order to distinguish it from the original 'old' inflation model of
Guth, in which the scalar Higgs field executed quantum mechanical tunnelling at $T \sim T_c$, where $T_c$ is the
critical temperature, from the metastable false ground state at $\phi = 0$ through a potential barrier to the true
ground state with $\phi > 0$. Although this model provided the genesis for the inflationary idea, it was quickly
shown to predict a universe very different to the one we observe. In short, the tunnelling process produces
bubble nucleation and it turns out that these bubbles are too small to be identified with the observable universe
and are carried apart too quickly by the intervening inflating space for them to coalesce, hence resulting in a
highly inhomogeneous universe, contrary to observations.

Let us now consider the evolution of the scale factor $R(t)$, the radiation energy
density $\rho_r$ and the scalar field $\phi$.

Phase 1 When the temperature is very high, i.e. far above the GUT phase
transition scale of $T_c \sim 10^{27}$ K, from Figure 16.1 we would expect the scalar field
to have the value $\phi = 0$ (i.e. at the true vacuum state for these temperatures),
and Figure 16.1 shows that it will remain at $\phi = 0$. Since $\rho_r \propto R^{-4}$, however, we
would expect the radiation to dominate over the scalar field at very early epochs.
Thus we have the standard early-time radiation-dominated Friedmann model, in
which we can neglect the curvature constant $k$. Thus, for $T \gg T_c$,

$$R \propto t^{1/2}, \qquad \rho_r \propto t^{-2}, \qquad \phi = 0.$$


Phase 2 It is clear from the above equations that there will come a time when
the scalar-field energy density dominates over that of the radiation. Provided that
this occurs for $T > T_c$ the scalar field remains at $\phi = 0$, in which case it acts as an
effective cosmological constant of value $\Lambda = V(0)$. Thus, in this phase, the scale
factor undergoes an exponential expansion:

$$R(t) \propto \exp\left[\left(\tfrac{1}{3}\Lambda\right)^{1/2} t\right].$$



As a result of the exponential expansion, however, there is a corresponding 
exponential decrease in the temperature T, which results in a rapid change of 
the potential function. Thus $T \sim T_c$ is reached very quickly, and so this phase
is extremely short-lived, and very little expansion is actually achieved. Indeed, if
$T \sim T_c$ is reached before the scalar-field energy density dominates over that of
the radiation then phase 2 does not occur at all.

Phase 3 Once $T \sim T_c$, we see from Figure 16.1 that the scalar field is now able to
roll downhill away from $\phi = 0$ and so the GUT phase transition occurs. Provided
that the potential is sufficiently flat, the slow-roll approximation holds and the
universe inflates, the evolution of the scalar field being determined by (16.12)
and the Hubble parameter by (16.13). If the potential is roughly constant then
the exponential expansion continues. The rapid growth of the scale factor once
again causes the evolution of the potential function as the temperature drops. The
duration of this period of inflation depends critically on the flatness and length
of the plateau of the $V(\phi)$ function for $T < T_c$. For certain ‘reasonable’ potentials
the universe can easily inflate in such a way that the number of e-foldings
$N > 60$, and can be considerably larger. This is therefore the main inflationary
phase. According to detailed calculations, phase 3 occurs between $t_1 \sim 10^{-36}$ s
and $t_2 \sim 10^{-34}$ s and the scale factor increases by a factor of around $10^{50}$.

Phase 4 Eventually, the slow-roll approximation fails and inflation ends. The
scalar field then rolls rapidly down towards the true vacuum state at $\phi = a$,
oscillating about the minimum point, and follows the behaviour outlined in
Section 16.6. In particular, if $V(a) = 0$ then the universe will revert to the standard
radiation-dominated Friedmann model with

$$R(t) \propto t^{1/2}.$$

Hence, at $t \sim 10^{-34}$ s, the universe starts a standard Friedmann expansion, albeit
with the desired ‘initial’ conditions. Thus, the inflationary model incorporates all
the observationally verified predictions of the standard cosmological models.

Although the ‘new’ inflation model still has its advocates, it suffers from
undesirable features. In particular, the scenario only provides an effective period
of inflation if $V(\phi, T)$ has a very flat plateau near $\phi = 0$, which is somewhat
artificial. Moreover, the period of thermal equilibrium prior to the inflationary
phase (so one can speak sensibly of the universe having a particular temperature)
requires many particles to interact with one another, and so already one requires
the universe to be very large and contain many particles. Finally, the universe
could easily recollapse before inflation starts. As a result of these difficulties, new
inflation may not be a viable model, and so there are strong theoretical reasons
to believe that the inflaton field $\phi$ cannot be identified with the GUT symmetry-breaking
Higgs field. Thus, the hope that GUTs could provide the mechanism for
the homogeneity and flatness of the universe may have to be abandoned.


16.10 Chaotic inflation 

In more recent models of inflation, the scalar field $\phi$ is not identified with
the Higgs field in the GUT phase transition but is regarded as a generic scalar
field present in the very early universe. In particular, these models invoke the
idea of chaotic inflation. In this scenario, as the universe exits the Planck era
at $t \sim 10^{-43}$ s the initial value of the scalar field $\phi_{\rm start}$ is set chaotically, i.e. it
acquires different random values in different regions of the universe. In some
regions, $\phi_{\rm start}$ is somewhat displaced from the minimum of the potential and
so the field subsequently rolls downhill. If the potential is sufficiently flat, the
field is more likely to be displaced a greater distance from its minimum and
will roll slowly enough, and for a sufficiently prolonged period, for the region to
undergo an effective period of inflation. Conversely, in other regions $\phi_{\rm start}$ may
not be displaced sufficiently from the minimum of the potential for the region to
inflate. Thus, on the largest scales the universe is highly inhomogeneous, but our
observable universe lies (well) within a region that underwent a period of inflation.

Figure 16.2 The potential $V(\phi) = \tfrac{1}{2}m^2\phi^2$ for a free scalar field. The field is
initially displaced from the minimum of the potential due to chaotic initial
conditions as the universe comes out of the Planck era.

According to this scenario, inflation may occur even in theories with very
simple potentials, such as $V(\phi) \sim \phi^n$, and is thus a very generic process that can
take place under a broad range of conditions. Indeed, the potential function need
not depend on the temperature $T$. A very simple example is a free scalar field,
for which $V(\phi) = \tfrac{1}{2}m^2\phi^2$ (see Figure 16.2). Moreover, in the chaotic scenario,
inflation may begin even if there is no thermal equilibrium in the early universe,
and it may even start just after the Planck epoch.


16.11 Stochastic inflation 

A natural extension to the chaotic inflation model is the mechanism of stochastic
(or eternal) inflation. The main idea in this scenario is to take account of quantum
fluctuations in the evolution of the scalar field, which we have thus far ignored
by modelling the field entirely classically. If, in the chaotic assignment of initial
values of the scalar field, some regions have a large value of $\phi_{\rm start}$ then quantum
fluctuations can cause $\phi$ to move further uphill in the potential $V(\phi)$. These
regions inflate at a greater rate than the surrounding ones, and the fraction of the
total volume of the universe containing the growing $\phi$-field increases. Quantum
fluctuations within these regions lead in turn to the production of some new
inflationary regions that expand still faster. This process thus leads to eternal
self-reproduction of the inflationary universe.


16.12 Perturbations from inflation 

We have seen that inflation can solve the horizon and flatness problems. Arguably 
its greatest success so far, however, is to provide a mechanism by which the 
fluctuations needed to seed the development of structure within the universe can 
be generated. This topic is the subject of much current research, and we can give 
only a limited treatment here. Nevertheless, by following through the equations for 
structure generation and development in the simplest case, namely for a spatially 
flat universe with a simple ‘gauge choice’ (see below), we hope that the reader 
will be able to get a flavour of the physics involved. 

The current opinion of how structure in the universe originated is that it was via 
amplification, during a period of inflation, of initial quantum irregularities of the 
scalar field that drives inflation. Thus what we need to do can be divided into two 
broad categories. First, we need to work out the equations of motion for spatial 
perturbations in the scalar field. This can be done classically, i.e. taking the scalar 
field as a classical source linked self-consistently to the gravitational field via a 
classical energy-momentum tensor. Second, we need to derive initial conditions 
for these perturbations, and this demands that we understand the quantum field 
theory of the perturbations themselves. This sounds formidable but actually turns 
out to be no more complicated than considering the quantum physics of a mass 
on a spring, albeit one in which the mass changes as a function of time. These 
topics are discussed in detail in the remainder of this chapter.


16.13 Classical evolution of scalar-field perturbations 


We assume that the scalar field $\phi$, which hitherto has been a function of cosmic
time $t$ only, now has perturbations that are functions of space and time. We can
thus write

$$\phi(t) \to \phi_0(t) + \delta\phi(t, \mathbf{x}). \tag{16.19}$$


These perturbations will lead to a perturbed energy-momentum tensor, which we
shall derive shortly. The Einstein field equations then imply that the Einstein
tensor is also perturbed away from its background value. In turn, therefore, we
must have a metric different from the Friedmann-Robertson-Walker one assumed
so far. We thus need to assume a form for this metric in order to calculate the
new Einstein tensor. It is at this point we must make the choice of ‘gauge’ (i.e.
coordinate system) referred to above. Once perturbations are present there is no
preferred way to define a spacetime slicing of the universe. The details of this are
quite subtle but amount simply to the fact that by choosing different coordinate
systems we can change the apparent character of the perturbations considerably.
For example, suppose that we choose, as a new time coordinate, one for which
surfaces of constant time have a constant value of the new perturbed scalar field
on them. This is always possible and, in such a gauge, the spatial fluctuations of
$\phi$ have apparently totally vanished!

To meet such problems, methods that deal only with gauge-invariant quantities
have been developed. We will make contact with such methods below, when
we introduce the so-called ‘curvature’ perturbations. These are gauge invariant
and therefore represent physical quantities. To reach this point, however, we first
work with a specific simple form of gauge known as the longitudinal
or Newtonian gauge, and indeed with a restricted form of this, one where
only one extra function (known here as a ‘potential function’) is introduced.
The justification for using such a restricted form is that it leads to an Einstein
tensor with the correct extra degrees of freedom to match the extra terms in the
scalar-field energy-momentum tensor arising from the field perturbations.

For a spatially flat ($k = 0$) background FRW model, which is what we will
assume, we adopt Cartesian comoving coordinates and write the perturbed metric as

$$ds^2 = (1 + 2\Phi)\,dt^2 - (1 - 2\Phi)R^2(t)\,(dx^2 + dy^2 + dz^2), \tag{16.20}$$

where $\Phi$ is a general infinitesimal function of all four coordinates (and should
not be confused with the scalar field $\phi$). Its assumed smallness means that we
will only need to consider quantities to first order in $\Phi$. A general discussion of
this linearising process is presented in the next chapter, but for the time being we
simply note that one can consider $\Phi$ as representing the Newtonian potential of
the perturbations. For instance, for a spherically symmetric perturbation of mass
$M$ and radius $r$, if we put $\Phi = -GM/(rc^2)$ then the first term of (16.20) recovers the
$tt$-term of the Schwarzschild metric.


The perturbed Einstein field equations 

We now need to find both the new energy-momentum tensor of the scalar field
$\phi$ and the new Einstein tensor corresponding to our perturbed metric. Equating
them will link our two perturbation variables $\delta\phi$ and $\Phi$ and provide us with the
equations of evolution that we need. The first step is to calculate the connection
coefficients corresponding to the perturbed metric (16.20) to first order in $\Phi$.
These are easily shown to take the form $\Gamma^\sigma{}_{\mu\nu} = (\Gamma_0)^\sigma{}_{\mu\nu} + \delta\Gamma^\sigma{}_{\mu\nu}$, where the first
term corresponds to the connection coefficients of the unperturbed metric (i.e.
with $\Phi = 0$) and the perturbation terms are given by

$$\delta\Gamma^0{}_{0\mu} = \partial_\mu\Phi,$$
$$\delta\Gamma^0{}_{ii} = -R^2(\dot\Phi + 4H\Phi),$$
$$\delta\Gamma^i{}_{00} = \frac{1}{R^2}\,\partial_i\Phi.$$

In these expressions, $H = \dot R/R$ is the Hubble parameter of the unperturbed background,
and no sum over repeated $i$ indices is implied. The remaining perturbed
connection coefficients either follow from symmetry or are zero.

These connection coefficients yield a Riemann and hence an Einstein tensor.
Again working to first order in $\Phi$, the perturbed part of the Einstein tensor is
found to be

$$\delta G^0_i = -2\,\partial_i(\dot\Phi + H\Phi),$$
$$\delta G^0_0 = -2(\nabla^2\Phi - 3H\dot\Phi - 3H^2\Phi),$$
$$\delta G^i_i = 2[\ddot\Phi + 4\dot\Phi H + (2\dot H + 3H^2)\Phi], \tag{16.21}$$

where again no sum over repeated $i$ indices is implied and the remaining entries
either follow from symmetry or are zero. The symbol $\nabla^2$ here denotes the spatial
Laplacian, which in this simple flat case is given by

$$\nabla^2 = \frac{1}{R^2}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right). \tag{16.22}$$


It is worth noting that, in the entries of (16.21), the time derivative of the Hubble
parameter appears. From (16.14), this can be rewritten as

$$\dot H = -\tfrac{1}{2}\dot\phi_0^2, \tag{16.23}$$

remembering that this equation now applies to the background FRW spacetime.

We also need to evaluate the perturbed part of the scalar-field energy-momentum
tensor. Substituting (16.19) into (16.7) and working to first order in
the perturbations, one quickly finds that

$$\delta T^0_i = \dot\phi_0\,\partial_i(\delta\phi),$$
$$\delta T^0_0 = -\dot\phi_0^2\,\Phi + \dot\phi_0\,\delta\dot\phi + V'\,\delta\phi,$$
$$\delta T^i_i = \dot\phi_0^2\,\Phi - \dot\phi_0\,\delta\dot\phi + V'\,\delta\phi, \tag{16.24}$$

where $V' = dV/d\phi_0$ and the remaining components either follow from symmetry
or are zero.

We may now use the Einstein field equations to relate the Einstein tensor and 
the scalar-field energy-momentum tensor. Since the unperturbed part of the field 
equations is automatically satisfied, one simply requires that 
$\delta G^\mu_\nu = -\delta T^\mu_\nu$ (since $\kappa = 8\pi G/c^4$ equals unity in our 
chosen system of units). We may thus equate, with the inclusion of a minus sign, 
the components shown in equations (16.21) and (16.24). At first sight, it is by no 
means obvious that we have allowed ourselves enough freedom in including only one 
extra function, $\Phi$, in the metric. Nevertheless, as we now show, everything in 
fact works out. Let us start with the $(^0_i)$-components, for which we have the 
equation 

$$
2\,\partial_i(\dot\Phi + H\Phi) = \dot\phi_0\,\partial_i(\delta\phi).
\tag{16.25}
$$


Remembering that $H$ and $\dot\phi_0$ have no spatial dependence, we can integrate this 
immediately to obtain 

$$
\dot\Phi + H\Phi = \tfrac{1}{2}\dot\phi_0\,\delta\phi.
\tag{16.26}
$$


One next equates the $(^i_i)$-components, which gives 

$$
-2[\ddot\Phi + 4\dot\Phi H + (2\dot H + 3H^2)\Phi] = \dot\phi_0^2\,\Phi - \dot\phi_0\,\delta\dot\phi + V'\,\delta\phi,
\tag{16.27}
$$

but we may show that this contains no information beyond that already obtained 
from the $(^0_i)$-components. In particular, differentiating (16.26) with respect to 
time gives 

$$
\ddot\Phi + \dot H\Phi + H\dot\Phi = \tfrac{1}{2}\ddot\phi_0\,\delta\phi + \tfrac{1}{2}\dot\phi_0\,\delta\dot\phi;
\tag{16.28}
$$

then, using equations (16.9) and (16.23) to substitute for $\ddot\phi_0$ and $\dot H$ 
respectively, one finds that (16.27) is satisfied if (16.26) holds, thus establishing 
consistency. The only new information must therefore come from equating the 
$(^0_0)$-components. Using (16.28) and eliminating $V'$ again then yields 

$$
(\dot H - \nabla^2)\,\Phi = -\tfrac{1}{2}\dot\phi_0^2\,\frac{d}{dt}\left(\frac{\delta\phi}{\dot\phi_0}\right).
\tag{16.29}
$$
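
For readers who wish to check the elimination, the following is a sketch of one possible route to (16.29), not necessarily the one the text has in mind. It assumes that the $(^0_0)$ entries of (16.21) and (16.24) are $\delta G^0_0 = -2(\nabla^2\Phi - 3H\dot\Phi - 3H^2\Phi)$ and $\delta T^0_0 = -\dot\phi_0^2\Phi + \dot\phi_0\,\delta\dot\phi + V'\delta\phi$, that the background field obeys $\ddot\phi_0 + 3H\dot\phi_0 + V' = 0$ (taken here to be the content of (16.9)), and that $\dot H = -\frac{1}{2}\dot\phi_0^2$.

```latex
\begin{align*}
% (0,0) component of \delta G^\mu_\nu = -\delta T^\mu_\nu
2(\nabla^2\Phi - 3H\dot\Phi - 3H^2\Phi)
  &= -\dot\phi_0^2\,\Phi + \dot\phi_0\,\delta\dot\phi + V'\,\delta\phi \\
% use (16.26): 3H(\dot\Phi + H\Phi) = (3/2) H \dot\phi_0 \delta\phi
2\nabla^2\Phi - 3H\dot\phi_0\,\delta\phi
  &= -\dot\phi_0^2\,\Phi + \dot\phi_0\,\delta\dot\phi + V'\,\delta\phi \\
% eliminate V' = -\ddot\phi_0 - 3H\dot\phi_0
2\nabla^2\Phi + \dot\phi_0^2\,\Phi
  &= \dot\phi_0\,\delta\dot\phi - \ddot\phi_0\,\delta\phi
   = \dot\phi_0^2\,\frac{d}{dt}\left(\frac{\delta\phi}{\dot\phi_0}\right) \\
% divide by -2 and use \dot H = -\dot\phi_0^2/2
(\dot H - \nabla^2)\,\Phi
  &= -\tfrac{1}{2}\dot\phi_0^2\,\frac{d}{dt}\left(\frac{\delta\phi}{\dot\phi_0}\right)
\end{align*}
```

The quotient-rule step in the third line is what makes the combination $\delta\phi/\dot\phi_0$ appear naturally on the right-hand side.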


Perturbation equations in Fourier space 
The results (16.26) and (16.29) are the basic equations relating $\Phi$ and 
$\delta\phi$. To make further progress, however, it is convenient to work instead in 
terms of the Fourier decomposition of these quantities and analyse what happens to a 
perturbation corresponding to a given comoving spatial scale. Thus, we assume 
that $\Phi$ and $\delta\phi$ are decomposed into a superposition of plane-wave states 
with comoving wavevector $\mathbf{k}$, so that 

$$
\Phi(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int \Phi_{\mathbf{k}}\exp(i\mathbf{k}\cdot\mathbf{x})\,d^3\mathbf{k},
$$

where (with a slight abuse of notation) $\mathbf{x} = (x, y, z)$ and a similar 
expression holds for $\delta\phi$. The evolution of a mode amplitude $\Phi_{\mathbf{k}}$ 
depends only on the comoving wavenumber $k = |\mathbf{k}|$; the corresponding actual 
physical wavenumber is $k/R(t)$. We thus work simply in terms of $\Phi_k$ and 
$\delta\phi_k$. In terms of these variables, the action of $\nabla^2$ will be just to 
multiply $\Phi_k$ by $-k^2/R^2(t)$, whereas the time derivatives remain unchanged. 
Equations (16.26) and (16.29) therefore become 

$$
\begin{aligned}
\dot\Phi_k + H\Phi_k &= \tfrac{1}{2}\dot\phi_0\,\delta\phi_k, \\
\left(\dot H + \frac{k^2}{R^2}\right)\Phi_k &= -\tfrac{1}{2}\dot\phi_0^2\,\frac{d}{dt}\left(\frac{\delta\phi_k}{\dot\phi_0}\right).
\end{aligned}
\tag{16.30}
$$


Thus, we see that we have obtained two coupled first-order differential equations 
for the quantities $\Phi_k$ and $\delta\phi_k$, which are the amplitudes of the 
plane-wave perturbations of comoving wavenumber $k$ in the metric and in the scalar 
field respectively. Clearly, what we could do next is to eliminate one quantity in 
terms of derivatives of the other and then obtain a single second-order equation in 
terms of just one of them (plus the background quantities, of course, but the 
evolution of these is assumed known). In fact, this leads to rather messy equations 
and, moreover, in terms of the discussion given above the results are not gauge 
invariant, since neither $\Phi_k$ nor $\delta\phi_k$ is gauge invariant on its own. 


16.14 Gauge invariance and curvature perturbations 

As mentioned above, gauge invariance is related to how we define spatial ‘slices’ 
of the perturbed spacetime. By transforming to a new time coordinate, one can 
apparently make the perturbations in the scalar field come and go at will. There 
are two ways to take care of this difficulty. First, one can choose variables that are 
insensitive to such changes and therefore definitely describe something physical. 
These are called gauge-invariant variables. Second, one can use variables which 
would change if one altered the slicing but which are defined relative to a particular 
slicing that can itself be defined physically. These are then also physical variables 
and are, perhaps confusingly, also sometimes called gauge invariant, although this 
is not really a good description. Note that changing spatial coordinates within a 
particular slicing also induces changes, but these are not relevant to our discussion 
here and we concentrate just on changes in time coordinate. 

Let us start our discussion by taking the first of the two routes outlined above, 
namely describing the perturbations in terms of truly gauge-invariant quantities. 
For any scalar function $f$ in spacetime, consider the effects upon it of the change 
in time coordinate $t \to t' = t + \Delta t$. We may define a new, perturbed, function by 

$$
f'(t') = f(t),
\tag{16.31}
$$

where, as just stated, we suppress the spatial dependence in what follows. Thus, to 
first order in $\Delta t$, we may write 

$$
f'(t) = f'(t' - \Delta t) \approx f(t) - \dot f\,\Delta t,
\tag{16.32}
$$

where we do not have to specify whether it is $f$ or $f'$ that is being differentiated 
with respect to time to obtain $\dot f$ or whether the latter is evaluated at $t$ or 
$t'$, since these would be second-order differences. Hence the perturbation in the 
scalar function due to the ‘gauge transformation’ $t \to t + \Delta t$ is given by$^4$ 

$$
\Delta f = -\dot f\,\Delta t.
\tag{16.33}
$$


4 This is the simplest version of the ‘Lie derivative’, which describes the change in a (possibly tensor) function 
when ‘dragged back’ along ‘flow lines’ in parameter space; see, for example, B. Schutz, Geometrical Methods 
of Mathematical Physics, Cambridge University Press, 1980. 

We may now evaluate the change in the perturbed spacetime metric corresponding 
to the gauge transformation $t \to t + \Delta t$. To do this, however, one must 
distinguish between the two occurrences of the $\Phi$-variable in (16.20). For an 
arbitrary scalar perturbation, the perturbed metric in fact takes the general form 

$$
ds^2 = (1 + 2\Psi)\,dt^2 - (1 - 2\Phi)\,R^2(t)\,(dx^2 + dy^2 + dz^2),
\tag{16.34}
$$

in which $\Psi$ and $\Phi$ are different functions. Nevertheless, for matter with no 
‘anisotropic stress’ (so that all the off-diagonal components of the space part 
of the stress-energy tensor are zero), the two functions may be taken as equal; 
this is the case for a perfect fluid or a scalar field and hence leads to (16.20). 
Even in this case, however, the two functions behave differently under the gauge 
transformation. We need consider only the $\Phi$-function above, which clearly takes 
the role of a spatial curvature term since it modifies the space part of the metric 
by a multiplicative factor. Under $t \to t + \Delta t$ we find that 

$$
R^2(1 - 2\Phi) \to R^2(1 - 2\Phi) - [2R\dot R(1 - 2\Phi) - 2R^2\dot\Phi]\,\Delta t.
\tag{16.35}
$$

Since both $\Phi$ and $\Delta t$ are infinitesimal, we may employ the same arguments that 
led to (16.33). Then, to first order, we have 

$$
\Delta\Phi = H\,\Delta t,
\tag{16.36}
$$

where we have also used the fact that $R\dot R = R^2H$. Thus, for any scalar function 
$f$ with perturbations $\delta f$, we see that the combination 

$$
\zeta_f = \Phi + \frac{H\,\delta f}{\dot f}
\tag{16.37}
$$

is gauge invariant under the gauge transformation $t \to t + \Delta t$ that we are 
considering, since to first order we have 

$$
\zeta_f' = \Phi + H\,\Delta t + \frac{H(\delta f - \dot f\,\Delta t)}{\dot f} = \zeta_f.
\tag{16.38}
$$

Thus, for the specific example of our scalar-field perturbation $\delta\phi$, we may 
identify the corresponding gauge-invariant quantity as 

$$
\zeta = \Phi + H\,\frac{\delta\phi}{\dot\phi_0}.
\tag{16.39}
$$

We will therefore use this variable (or its Fourier transform) in our subsequent 
discussion in later sections. In the literature this quantity is called the curvature 
perturbation, for reasons that will become clear shortly. 
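
The cancellation behind (16.38) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the first-order transformation rules to be $\Delta\Phi = H\Delta t$ and $\Delta(\delta\phi) = -\dot\phi_0\Delta t$, as in (16.36) and (16.33); the function names `gauge_shift` and `zeta` and all numerical values are hypothetical and chosen purely for demonstration.

```python
def gauge_shift(Phi, dphi, H, phidot0, dt):
    """Apply the first-order time shift t -> t + dt to (Phi, delta phi)."""
    Phi_new = Phi + H * dt            # rule (16.36)
    dphi_new = dphi - phidot0 * dt    # rule (16.33) applied to the scalar field
    return Phi_new, dphi_new


def zeta(Phi, dphi, H, phidot0):
    """Gauge-invariant combination (16.39)."""
    return Phi + H * dphi / phidot0


# Illustrative numbers only (not taken from the text).
Phi, dphi = 1.0e-5, 3.0e-6
H, phidot0 = 2.0e-6, -1.6e-6

before = zeta(Phi, dphi, H, phidot0)
for dt in (0.5, -2.0, 10.0):
    Phi2, dphi2 = gauge_shift(Phi, dphi, H, phidot0, dt)
    after = zeta(Phi2, dphi2, H, phidot0)
    # Phi and delta phi each change, but the combination does not.
    print(dt, Phi2, dphi2, after - before)
```

Each of $\Phi$ and $\delta\phi$ changes with the choice of `dt`, while the printed difference in $\zeta$ stays at the level of floating-point rounding.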

Before going on to consider the evolution of these curvature perturbations, 
let us first discuss briefly the second route outlined at the start of this section 
for defining a physically meaningful perturbation variable. This route can be 
illustrated directly with the $\Phi$-function, and one begins by defining the quantity 

$$
\mathcal{R} \equiv -\Phi|_{\rm co},
\tag{16.40}
$$

where the subscript indicates that $\Phi$ is to be evaluated on comoving slices. By 
‘comoving’ we mean a time-slicing that is orthogonal to the worldlines of the 
‘fluid’ that makes up the matter. For an ordinary fluid, this would amount to 
choosing frames in which, at each instant and position, the fluid appears to be 
at rest. The same applies here and, because the frame involved is physically 
defined, the variable $\mathcal{R}$, which measures the spatial curvature in the given 
frame, is itself physically well defined. Thus the quantity $\mathcal{R}$ is also called 
the ‘curvature perturbation’ in the literature. As we now show, it is in fact equal 
to minus the variable $\zeta$ defined in (16.39), and so both may be described as such. 

For any scalar density perturbation $\delta\rho$, one can write the spatial curvature in 
the comoving slice as 



(16.41) 


Let us therefore consider what happens for the particular case of a perturbation in 
a scalar field. As shown in (16.24), the $(^0_i)$-components of the perturbed 
stress-energy tensor read 

$$
\delta T^0_i = \dot\phi_0\,\partial_i(\delta\phi).
\tag{16.42}
$$

In the comoving frame, this momentum density must vanish, by definition, and 
so the scalar-field perturbation cannot depend on the spatial coordinates and thus 
vanishes. Hence, for a scalar field, we have 

$$
\mathcal{R} = -\Phi - H\,\frac{\delta\phi}{\dot\phi_0} = -\zeta.
\tag{16.43}
$$


16.15 Classical evolution of curvature perturbations 

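We now consider the evolution of the Fourier transform of the gauge-invariant 
perturbation (16.39), namely 

$$
\zeta_k = \Phi_k + H\,\frac{\delta\phi_k}{\dot\phi_0},
\tag{16.44}
$$

which is clearly itself gauge invariant. Using (16.30), the second-order differential 
equation satisfied by this quantity is quite simply shown to be 

$$
\ddot\zeta_k + \left(\frac{\ddot H}{\dot H} - \frac{2\dot H}{H} + 3H\right)\dot\zeta_k + \frac{k^2}{R^2}\,\zeta_k = 0.
\tag{16.45}
$$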
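
As a hedged sketch of how such a second-order equation can be reached, the steps below start from the pair (16.30), taken in the form $\dot\Phi_k + H\Phi_k = \frac{1}{2}\dot\phi_0\,\delta\phi_k$ and $(\dot H + k^2/R^2)\Phi_k = -\frac{1}{2}\dot\phi_0^2\,d(\delta\phi_k/\dot\phi_0)/dt$, together with $\dot H = -\frac{1}{2}\dot\phi_0^2$. The shorthand symbols $u_k$ and $A$ are introduced here for convenience only and do not appear in the text.

```latex
\begin{align*}
&\text{With } u_k \equiv \delta\phi_k/\dot\phi_0 \text{ the pair (16.30) reads}\quad
  \dot\Phi_k + H\Phi_k = -\dot H\,u_k, \qquad
  \left(\dot H + \frac{k^2}{R^2}\right)\Phi_k = \dot H\,\dot u_k . \\
&\text{Differentiating } \zeta_k = \Phi_k + H u_k:\quad
  \dot\zeta_k = \dot\Phi_k + \dot H u_k + H\dot u_k = H(\dot u_k - \Phi_k)
  = \frac{H k^2}{R^2\dot H}\,\Phi_k , \\
&\text{so that}\quad \Phi_k = A\,\dot\zeta_k, \qquad A \equiv \frac{R^2\dot H}{H k^2},
  \qquad \frac{\dot A}{A} = 2H + \frac{\ddot H}{\dot H} - \frac{\dot H}{H} . \\
&\text{Substituting } u_k = (\zeta_k - \Phi_k)/H \text{ in the first equation:}\quad
  \dot\Phi_k + \left(H - \frac{\dot H}{H}\right)\Phi_k + \frac{\dot H}{H}\,\zeta_k = 0 . \\
&\text{Inserting } \Phi_k = A\dot\zeta_k \text{ and dividing by } A:\quad
  \ddot\zeta_k + \left(\frac{\ddot H}{\dot H} - \frac{2\dot H}{H} + 3H\right)\dot\zeta_k
  + \frac{k^2}{R^2}\,\zeta_k = 0 .
\end{align*}
```

The coefficient of $\zeta_k$ follows from $\dot H/(HA) = k^2/R^2$, which is why the final equation contains the physical wavenumber $k/R$ alone.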


Given a potential $V(\phi_0)$ and some initial conditions for $H$ and $\phi_0$, we can 
integrate the background evolution equations numerically and obtain $H$ and $\phi_0$ as 
functions of cosmic time $t$. If we simultaneously integrate $\zeta_k$ using (16.45), 
we can thereby trace the evolution of the curvature perturbation over the time period 
of interest. An example of the results of this procedure is shown in Figures 16.3 and 
16.4, for the particular choice of potential $V(\phi_0) = \frac{1}{2}m^2\phi_0^2$ 
(chaotic inflation) with $m \approx 2 \times 10^{-6}$ (a typical value in such theories). 
The initial conditions for $H$ and $\phi_0$ were chosen to give inflation over the 
period $\ln t \approx 11$ to $16$, and the comoving wavenumber of the perturbation$^5$ 
was chosen as $k = 10^4$. 

Figure 16.3 Evolution of the logarithm of the comoving Hubble distance 
$\ln[1/(RH)]$ versus $\ln t$ (solid line) in a chaotic inflation model driven by a free 
scalar field of mass $m \approx 2 \times 10^{-6}$, the initial values of $H$ and $\phi_0$ 
being chosen in such a way that there is a period of inflation lasting approximately 
for the period $\ln t \approx 11$ to $16$. Also shown (broken line) is the fixed comoving 
scale $1/k$, where $k = 10^4$ is the comoving wavenumber of the perturbation shown in 
Figure 16.4. Note that all quantities are in Planck units. 

Figure 16.4 Evolution of the curvature perturbation $\zeta_k$ versus $\ln t$ for 
$k = 10^4$ in a chaotic inflation model driven by a free scalar field of mass 
$m \approx 2 \times 10^{-6}$. Note that all quantities are in Planck units. 
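
The background integration described above can be sketched in a few lines. This is a minimal illustration, not the code used to produce the figures: it assumes the background equations $\ddot\phi_0 + 3H\dot\phi_0 + m^2\phi_0 = 0$ and $H^2 = \frac{1}{3}(\frac{1}{2}\dot\phi_0^2 + \frac{1}{2}m^2\phi_0^2)$ in units with $8\pi G = c = 1$, rescales time to $\tau = mt$ so that $m$ drops out, and uses a hand-written fourth-order Runge-Kutta step. The function names, the starting field value and the step size are all hypothetical choices.

```python
import math


def derivs(state):
    """Time derivatives of (phi, dphi/dtau, ln R), with tau = m t and h = H/m."""
    phi, v, _ = state
    h = math.sqrt((v * v + phi * phi) / 6.0)   # Friedmann constraint
    return (v, -3.0 * h * v - phi, h)          # field equation and d(ln R)/dtau = h


def rk4_step(state, dtau):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, f):
        return tuple(s + f * dtau * ki for s, ki in zip(state, k))

    k1 = derivs(state)
    k2 = derivs(shifted(k1, 0.5))
    k3 = derivs(shifted(k2, 0.5))
    k4 = derivs(shifted(k3, 1.0))
    return tuple(s + dtau * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def efolds_of_inflation(phi_start, dtau=1.0e-3, max_steps=200000):
    """Integrate until -Hdot/H^2 reaches 1 and return the growth in ln R."""
    # Start on the slow-roll solution dphi/dtau = -sqrt(2/3) (an assumption).
    state = (phi_start, -math.sqrt(2.0 / 3.0), 0.0)
    for _ in range(max_steps):
        phi, v, ln_R = state
        h_squared = (v * v + phi * phi) / 6.0
        epsilon = 0.5 * v * v / h_squared      # equals -Hdot/H^2
        if epsilon >= 1.0:                     # R stops accelerating here
            return ln_R
        state = rk4_step(state, dtau)
    raise RuntimeError("inflation did not end within max_steps")


print(efolds_of_inflation(16.0))
```

While `epsilon` is below unity the scale factor accelerates, which is the same condition as the comoving Hubble distance $1/(RH)$ decreasing. The slow-roll estimate $\ln(R_{\rm end}/R_{\rm start}) \approx \phi_0^2/4$ suggests a result in the low sixties for the starting value used here.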

From Figure 16.3, one can verify that the universe is indeed inflating during the 
period $\ln t \approx 11$ to $16$, since the comoving Hubble distance $1/(RH)$ is 
decreasing with cosmic time (see Section 16.1). In inflationary theory, this quantity 
is loosely called the ‘horizon’ but must be distinguished from the ‘particle horizon’, 
as discussed in Section 15.12. The broken line in Figure 16.3 is the natural logarithm 
of the reciprocal of the comoving wavenumber $k$, which is of course constant for 
a given perturbation. This reciprocal, $1/k$, gives another dimensionless scale and 
(ignoring possible factors of $2\pi$ that, one could argue, should be introduced) can 
be thought of as the comoving wavelength scale of the perturbation itself. 

The behaviour of the curvature perturbation is shown in Figure 16.4 for 
$k = 10^4$ and can be understood from the behaviour of the comoving Hubble 
distance (or horizon) in Figure 16.3.$^6$ Whilst the perturbation scale $1/k$ is less 
than the horizon radius $1/(RH)$ the curvature perturbation $\zeta_k$ just oscillates. 
Once the comoving horizon radius has dropped below $1/k$, however, we see that (at 
$\ln t \approx 13$) the perturbation suddenly ‘freezes’ and no longer oscillates. We 
speak of this moment, when $1/k$ becomes greater than $1/(RH)$, as the perturbation 
‘leaving the horizon’ and, in intuitive terms, we can understand that beyond this 
point the perturbation is no longer able to feel its own self-gravity, since it is larger 
than the characteristic scale over which physical processes in the universe operate 
coherently. The curvature perturbation thereafter remains frozen at whatever value 
it has reached at this point until much later in the history of the universe, when 
the comoving horizon scale eventually catches up with $1/k$ again. At this point, 
the perturbation is said to ‘re-enter the horizon’, and oscillations will begin again 
(though at this stage it is not expected that these will be in the scalar field itself, 
since the latter is thought to decay into other particles, via the process of reheating, 
shortly after inflation ends; see Section 16.6). 

5 Note that all quantities here are measured in Planck units, e.g. the masses are in inverse Planck lengths and 
the times in Planck times. 

6 The initial conditions used for examining the classical behaviour of $\zeta_k$ can of course be chosen arbitrarily. The 
starting values of $\zeta_k$ and its time derivative used in Figure 16.4 in fact correspond to ‘quantum’ conditions, 
where field-theoretic values for the initial fluctuation are set. This is discussed in Section 16.16 below, where a 
new variable $\varsigma_k$, related to $\zeta_k$, is introduced. The specific values used correspond to evaluating the imaginary 
part of equation (16.51) and its time derivative, followed by a global phase shift such that the initial phase 
is zero. 

The key point to note is that, via inflation, one has produced ‘super-horizon’ 
scale fluctuations in the early universe. These fluctuations later go on to provide 
the seeds for galaxy formation and the perturbations in the cosmic microwave 
background radiation that we observe today. By studying the distribution of 
galaxies and CMB fluctuations as a function of scale, it is possible to obtain an 
idea of the underlying primordial spectrum of perturbations that produced them. 
Thus, by predicting this primordial spectrum, we can perform a test of the whole 
inflationary picture for the origin of fluctuations. This is obviously an area of great 
current interest. We can give only a simplified treatment, but the basic equations 
are within our reach, as we now discuss. 


16.16 Initial conditions and normalisation of curvature perturbations 

The key concept we need for predicting the primordial spectrum of perturbations 
produced during inflation can be stated in the following question: what sets the 
initial conditions for the perturbation itself? If we knew this for each $k$ then, 
since the evolution of $\zeta_k$ through to the point where it freezes would be known 
given the evolution of the background model, we could compute a spectrum of 
curvature perturbations as a function of $k$. 

The basic idea for setting the initial conditions for the perturbations is that they 
come from quantum-field-theoretic fluctuations in the value of the scalar field $\phi$. 
Thus the ‘classical’ perturbations discussed above need to be quantised, in a field 
theory sense, and this will allow their initial values to be set. A rigorous way 
of performing this quantisation has been developed$^7$ and, although the process is 
complicated, the final result in our case is very simple. To apply the result, we 
must first make two changes of variable in our discussion above. 

• Convert from cosmic time $t$ to a new dimensionless time variable $\eta$ known as 
‘conformal time’ and defined by $d\eta/dt = c/R$. 

• Convert from the curvature perturbation $\zeta_k$ to a new variable $\varsigma_k$ given by 
$\varsigma_k = \alpha\zeta_k$, where $\alpha = R\dot\phi_0/H$. 

The formal procedure then shows that the correct quantisation may be achieved 
simply by treating $\varsigma_k$ as a free complex scalar field and quantising it in the 
standard fashion. The evolution equation for the quantum perturbations turns out to be 
identical to the ‘classical’ equation for $\varsigma_k$. Thus, having fixed initial 
conditions for $\varsigma_k$ quantum mechanically, one may follow the classical evolution. 

7 See, for example, V. F. Mukhanov, H. A. Feldman & R. H. Brandenberger, Theory of cosmological 
perturbations, Physics Reports 215, 203-333, 1992. 

Let us first derive the classical evolution equation for $\varsigma_k$. Making the 
transformation of variables noted above, equation (16.45) becomes even simpler. In 
particular, the intermediate variable $\alpha$ was chosen in order to remove the 
first-derivative term in (16.45), so as to make it more like a simple harmonic 
oscillator equation. Using a prime to denote a derivative with respect to conformal 
time, we obtain 

$$
\varsigma_k'' + \left(k^2 - \frac{\alpha''}{\alpha}\right)\varsigma_k = 0.
\tag{16.46}
$$


It is now clear that we are dealing with the equation for the $k$th mode of a scalar 
field with a time-variable mass given by $m_{\rm eff}^2 = -\alpha''/\alpha$. The explicit 
expression for this effective mass is given in terms of the background quantities by 

$$
m_{\rm eff}^2 = \frac{\phi_0'^2}{2} - \frac{2\phi_0'\phi_0''}{RH} - \frac{\phi_0'^4}{2R^2H^2} - \frac{\phi_0'''}{\phi_0'}
\tag{16.47}
$$

$$
\phantom{m_{\rm eff}^2} = -2R^2H^2\left(1 + \frac{\dot\phi_0^2}{2H^2} + \frac{\dot\phi_0^4}{4H^4} + \frac{\dot\phi_0\ddot\phi_0}{H^3} + \frac{3\ddot\phi_0}{2H\dot\phi_0} + \frac{\dddot\phi_0}{2H^2\dot\phi_0}\right),
\tag{16.48}
$$

where, in the last line, we have re-expressed the result in terms of derivatives with 
respect to cosmic time $t$ rather than conformal time, which we will find useful 
momentarily. Perhaps surprisingly, it is the $\phi_0'''/\phi_0'$ term in (16.47) that 
gives rise to the leading-order term $-2R^2H^2$ in (16.48)! 
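
The conformal-time expression for $\alpha''/\alpha$ can be obtained in a few lines. The sketch below assumes $\alpha = R\dot\phi_0/H$ and $\dot H = -\frac{1}{2}\dot\phi_0^2$, and introduces the conformal Hubble rate $\mathcal{H} \equiv R'/R = RH$ as a shorthand that is not used in the text itself.

```latex
\begin{align*}
\alpha &= \frac{R\dot\phi_0}{H} = \frac{R\,\phi_0'}{\mathcal{H}}, \qquad
  \mathcal{H} \equiv \frac{R'}{R} = RH, \qquad
  \mathcal{H}' = \mathcal{H}^2 - \tfrac{1}{2}\phi_0'^2 , \\
\frac{\alpha'}{\alpha} &= \mathcal{H} + \frac{\phi_0''}{\phi_0'} - \frac{\mathcal{H}'}{\mathcal{H}}
  = \frac{\phi_0''}{\phi_0'} + \frac{\phi_0'^2}{2\mathcal{H}} , \\
\frac{\alpha''}{\alpha} &= \left(\frac{\alpha'}{\alpha}\right)' + \left(\frac{\alpha'}{\alpha}\right)^2
  = \frac{\phi_0'''}{\phi_0'} + \frac{2\phi_0'\phi_0''}{\mathcal{H}}
  - \frac{\phi_0'^2}{2} + \frac{\phi_0'^4}{2\mathcal{H}^2} , \\
\frac{\phi_0'''}{\phi_0'} &= R^2\left(2H^2 + \dot H + \frac{3H\ddot\phi_0}{\dot\phi_0}
  + \frac{\dddot\phi_0}{\dot\phi_0}\right)
  \qquad\text{using } \phi_0' = R\dot\phi_0 .
\end{align*}
```

The last line shows where the $2R^2H^2$ contribution comes from: two of the three conformal-time derivatives act on the factors of $R$ rather than on the field.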

To set the initial conditions for $\zeta_k$, we will study the variable-mass term 
$m_{\rm eff}^2$ in the form (16.48). In the ‘slow-roll’ approximation, $\ddot\phi_0$ and 
higher derivatives were neglected. Furthermore, here we shall assume that 
$\dot\phi_0 \ll H$ during the periods of interest. In this case 
$m_{\rm eff}^2 \approx -2R^2H^2$ and (16.46) becomes 

$$
\varsigma_k'' + (k^2 - 2R^2H^2)\,\varsigma_k = 0.
\tag{16.49}
$$

In this form, we can see the origin of the behaviour discussed above in terms of 
a perturbation ‘leaving the horizon’. When $k \gg RH$ the perturbation length scale 
is within the horizon (since $1/k \ll 1/(RH)$) and we have oscillatory behaviour. 
When $k \ll RH$, however, the perturbation length scale exceeds the horizon and 
we have exponential growth in $\varsigma_k$. Moreover, in the latter case we see directly 
from (16.46) that, if $k$ can be neglected, we may immediately deduce the solution 
$\varsigma_k \propto \alpha$. Since $\varsigma_k = \alpha\zeta_k$, this means that the curvature 
perturbation $\zeta_k$ is constant, which is exactly the behaviour seen in Figure 16.4. 

Let us now consider further the regime $k \gg RH$, when the perturbations are well 
inside the horizon, which is where the initial conditions for $\varsigma_k$ can be set. In 
this regime, (16.49) becomes simply the harmonic oscillator equation 
$\varsigma_k'' + k^2\varsigma_k = 0$, the quantisation of which is well understood. This 
quantisation demands that the norm of any state evaluates to unity in Planck units, 
or equivalently that the conserved current of the field $\varsigma$ is unity, so that 

$$
\varsigma_k\,\varsigma_k^{*\prime} - \varsigma_k^*\,\varsigma_k' = i.
\tag{16.50}
$$

It is this condition that sets the absolute scale of the perturbations. Hence, the 
properly normalised positive-energy solution in the regime $k \gg RH$ is given (up to 
a constant phase factor) by 

$$
\varsigma_k = \frac{1}{\sqrt{2k}}\exp(-ik\eta),
\tag{16.51}
$$

which is therefore the form to which any solution of (16.49) must tend well within 
the horizon. 

We may now attempt to obtain a full solution to (16.49) and can in fact achieve 
this quite simply. Consider the following series of manipulations concerning the 
conformal time $\eta$, in which we carry out an integration by parts: 

$$
\begin{aligned}
\eta = \int\frac{dt}{R} = \int\frac{dR}{R^2H}
&= -\frac{1}{RH} - \int\frac{dH}{RH^2} \\
&= -\frac{1}{RH} - \int\frac{\dot H\,dR}{R^2H^3} \\
&= -\frac{1}{RH} + \int\frac{\dot\phi_0^2}{2H^2}\,\frac{dt}{R}.
\end{aligned}
$$

Again ignoring a term in $\dot\phi_0^2/H^2$, we can thus write 

$$
\eta = \eta_{\rm end} - \frac{1}{RH},
\tag{16.52}
$$

where $\eta_{\rm end}$ is the value at which the conformal time saturates at the end of 
inflation (that it does indeed saturate is obvious from the facts that $d\eta/dt = 1/R$ 
and that $R$ is increasing exponentially during inflation). Figure 16.5 shows that 
(16.52) is indeed a good approximation during inflation in our current numerical 
example. Equation (16.49) now becomes 

$$
\varsigma_k'' + \left[k^2 - \frac{2}{(\eta_{\rm end} - \eta)^2}\right]\varsigma_k = 0,
\tag{16.53}
$$


which finally is exactly soluble. There is a unique solution (up to a constant phase 
factor) that tends to (16.51) for small $\eta$; it is given by 

$$
\varsigma_k = \frac{1}{\sqrt{2k^3}}\,\frac{i + k(\eta_{\rm end} - \eta)}{\eta_{\rm end} - \eta}\,e^{-ik\eta}.
\tag{16.54}
$$

By inspection this has the correct property for $\eta \ll \eta_{\rm end}$ provided that 
$k\eta_{\rm end} \gg 1$. Comparison with Figure 16.5 shows that this is indeed the case 
for $k$-values of interest (for the figure, $k = 10^4$ and $\eta_{\rm end} \approx 0.64$). 

Figure 16.5 Evolution of conformal time $\eta$ for the same numerical case as that 
illustrated in Figures 16.3 and 16.4 (solid curve). The broken curve shows the 
approximation given in equation (16.52), which is seen to be very good once 
inflation starts, around $\ln t \approx 11$. 


Now that we have a correctly normalised general solution for $\varsigma_k$, let us 
consider the regime $k \ll RH$ at which the perturbation length scale exceeds the 
horizon. We use (16.52) to rewrite the solution just found as 

$$
\varsigma_k = \frac{1}{\sqrt{2k^3}}\,(k + iRH)\,e^{-ik\eta} \approx \frac{i}{\sqrt{2k^3}}\,RH\,e^{-ik\eta},
\tag{16.55}
$$

where the final expression is valid for $k \ll RH$. Thus, for such modes, dropping 
the phase factor (which is effectively constant once $\eta$ approaches $\eta_{\rm end}$), 

$$
\zeta_k = \frac{\varsigma_k}{\alpha} \approx \frac{i}{\sqrt{2k^3}}\,\frac{H^2}{\dot\phi_0}.
\tag{16.56}
$$

Since we have demonstrated that $\zeta_k$ is constant after the mode has left the 
horizon, this means we are free to evaluate the right-hand side at the horizon exit 
itself. We therefore write schematically 

$$
\zeta_k = \frac{i}{\sqrt{2k^3}}\left(\frac{H^2}{\dot\phi_0}\right)_{k=RH}.
\tag{16.57}
$$

This is a famous and important result in inflationary theory; it gives the (constant) 
value of the amplitude of the plane-wave curvature perturbation having comoving 
wavenumber $k$ for modes whose length scale exceeds the horizon. 
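
The properties claimed for the mode function can be checked numerically. The sketch below assumes that (16.54) has the form $\varsigma_k = (2k^3)^{-1/2}\,[i + k(\eta_{\rm end}-\eta)]\,e^{-ik\eta}/(\eta_{\rm end}-\eta)$ and that the normalisation condition is $\varsigma_k\varsigma_k^{*\prime} - \varsigma_k^*\varsigma_k' = i$; the function names `mode` and `ode_residual`, the finite-difference step and the sample values of $k$ and $\eta$ are hypothetical.

```python
import cmath
import math


def mode(k, eta, eta_end):
    """Mode function of the assumed form (16.54) and its conformal-time derivative."""
    x = eta_end - eta                        # equals 1/(RH) under (16.52)
    amp = cmath.exp(-1j * k * eta) / math.sqrt(2.0 * k**3)
    s = amp * (1j + k * x) / x
    ds = amp * (1j / x**2 + k / x - 1j * k**2)   # analytic derivative d s / d eta
    return s, ds


def ode_residual(k, eta, eta_end, h=1.0e-4):
    """Central-difference residual of s'' + [k^2 - 2/(eta_end - eta)^2] s."""
    s_minus, _ = mode(k, eta - h, eta_end)
    s_zero, _ = mode(k, eta, eta_end)
    s_plus, _ = mode(k, eta + h, eta_end)
    second = (s_plus - 2.0 * s_zero + s_minus) / h**2
    return second + (k**2 - 2.0 / (eta_end - eta) ** 2) * s_zero


k, eta_end = 1.0, 0.0
s, ds = mode(k, -5.0, eta_end)

# Conserved current: should equal i at any time.
print(s * ds.conjugate() - s.conjugate() * ds)

# Residual of the mode equation: should be close to zero.
print(abs(ode_residual(k, -5.0, eta_end)))

# Well inside the horizon |s| tends to 1/sqrt(2k); well outside it tends to
# RH/sqrt(2k^3) with RH = 1/(eta_end - eta).
s_early, _ = mode(k, -1.0e4, eta_end)
s_late, _ = mode(k, -1.0e-3, eta_end)
print(abs(s_early) * math.sqrt(2.0 * k), abs(s_late) * 1.0e-3 * math.sqrt(2.0 * k**3))
```

Small values of $k$ are used only so that a simple finite difference resolves the oscillations; the analytic checks do not depend on that choice.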


16.17 Power spectrum of curvature perturbations 

From the result (16.57), we can deduce an expression for the power spectrum 
$\mathcal{P}_\zeta(k)$ of the primordial curvature perturbations. The precise definition of 
this spectrum is a matter of convention, of which there are several, but we will adopt 
the most commonly used. In this case, the power spectrum of a given spatially 
varying field is defined as the contribution to the total variance of the field per unit 
logarithmic interval in $k$. Thus, we define the curvature spectrum 
$\mathcal{P}_\zeta(k)$ such that 

$$
\langle\zeta(\mathbf{x})\zeta^*(\mathbf{x})\rangle = \int_0^\infty \mathcal{P}_\zeta(k)\,d(\ln k),
\tag{16.58}
$$

where $\langle\cdots\rangle$ denotes an expectation value and the total spatial variation 
of the curvature perturbations is 

$$
\zeta(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int \zeta_{\mathbf{k}}\exp(i\mathbf{k}\cdot\mathbf{x})\,d^3\mathbf{k}.
\tag{16.59}
$$

In these expressions, $\mathbf{x}$ refers to comoving coordinates and $k = |\mathbf{k}|$. 
Evaluating $\langle\zeta(\mathbf{x})\zeta^*(\mathbf{x})\rangle$, and remembering that 
$d^3\mathbf{k} = 4\pi k^2\,dk$, one finds that (16.58) is satisfied providing that 

$$
\langle\zeta_{\mathbf{k}}\zeta^*_{\mathbf{k}'}\rangle = \frac{2\pi^2}{k^3}\,\mathcal{P}_\zeta(k)\,\delta^{(3)}(\mathbf{k} - \mathbf{k}'),
$$

where $\delta^{(3)}(\mathbf{k} - \mathbf{k}')$ is the three-dimensional delta function. We 
may therefore write $\mathcal{P}_\zeta(k) = k^3\langle|\zeta_k|^2\rangle/(2\pi^2)$ and, using 
(16.57), we finally obtain$^8$ 

$$
\mathcal{P}_\zeta(k) = \left(\frac{H^2}{2\pi\dot\phi_0}\right)^2_{k=RH}.
\tag{16.60}
$$


8 We note that the quantity $\mathcal{P}_\zeta(k)$ is often written using the alternative notation $\Delta^2_\zeta(k)$. In addition, it is common 
to define the quantity $P_\zeta(k) \equiv \langle|\zeta_k|^2\rangle$, which is also often called the power spectrum and is related to $\mathcal{P}_\zeta(k)$ 
by $\mathcal{P}_\zeta(k) = k^3P_\zeta(k)/(2\pi^2)$. 

In the slow-roll approximation, we know that $H$ is only slowly decreasing 
whilst $\dot\phi_0$ is approximately constant. To a first approximation, therefore, the 
power spectrum of the perturbations expected from inflation, as measured by 
the contribution to the total fluctuation per unit logarithmic interval, is constant. 
Such a spectrum is called scale invariant and was proposed in the late 1960s as 
being the most likely to lead to structure appropriately distributed over the scales 
we see today. It is also known as a Harrison-Zel’dovich spectrum, after its two 
co-proposers. Here we can see it emerging as a prediction of inflation. We can 
go further, however, by noting that, during inflation, $H$ is slowly declining, 
$\dot\phi_0$ is approximately constant and $R$ is increasing exponentially. Thus modes 
with higher $k$, which leave the horizon later in time, have a slightly lower value of 
$\mathcal{P}_\zeta(k)$ since $H$ is lower there. As a result, the spectrum is predicted to be 
not exactly constant, but slightly declining, as a function of $k$. The details of this 
depend on the details of the potential $V(\phi_0)$, but we can see from the analysis 
here that this is a generic prediction of inflation (assuming that slow-roll is an 
accurate model). 
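
The slight decline with $k$ can be illustrated for the quadratic potential used earlier. The sketch below is an assumption-laden illustration rather than a result from the text: it takes the power spectrum to be $(H^2/2\pi\dot\phi_0)^2$ evaluated at horizon exit, and it uses the slow-roll relations $H^2 = m^2\phi_0^2/6$ and $\dot\phi_0 = -m\sqrt{2/3}$ for $V = \frac{1}{2}m^2\phi_0^2$ in units with $8\pi G = c = 1$. The function names and the sample field values are hypothetical.

```python
import math


def curvature_power(H, phidot):
    """Power spectrum (H^2 / (2 pi phidot))^2 evaluated at horizon exit."""
    return (H * H / (2.0 * math.pi * phidot)) ** 2


def slow_roll_background(phi, m):
    """Slow-roll H and phidot for V = m^2 phi^2 / 2 (assumed relations)."""
    H = m * phi / math.sqrt(6.0)
    phidot = -m * math.sqrt(2.0 / 3.0)
    return H, phidot


m = 2.0e-6
# The field rolls towards smaller values, so a mode that leaves the horizon
# later (higher k) does so at a smaller value of phi and hence a smaller H.
for phi in (16.0, 15.0, 14.0):
    H, phidot = slow_roll_background(phi, m)
    print(phi, H, curvature_power(H, phidot))
```

In this approximation $\dot\phi_0$ is exactly constant and the whole $k$-dependence enters through the slowly falling $H$, in line with the argument above.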

Before going on to discuss in the next section the comparison of the prediction 
(16.60) with cosmological observations, it is worthwhile re-deriving this result 
in a more heuristic (and perhaps enlightening) manner. For a scalar field in an 
ordinary Minkowski spacetime, the zero-point uncertainty fluctuation is given by 

$$
\delta\phi_{k_p} = \frac{1}{\sqrt{2k_p}}\,\frac{e^{-ik_pt}}{\sqrt{V}}
\tag{16.61}
$$

for a mode with physical wavenumber $k_p$, where $V$ is a normalising volume. Here, 
instead of $k_p$, we wish to use the comoving wavenumber $k$, which is related to 
the physical wavenumber by $k = Rk_p$. Moreover, an obvious length scale for the 
normalising volume is the scale factor $R$. Thus, in our expanding FRW spacetime, 
we assume that 

$$
\delta\phi_k = \frac{1}{\sqrt{2k/R}}\,\frac{e^{-ikt/R}}{R^{3/2}}.
\tag{16.62}
$$

As explained above, the corresponding power spectrum of the fluctuations 
$\delta\phi_k$ is obtained by multiplying its squared norm by $4\pi k^3/(2\pi)^3$, which 
gives 

$$
\mathcal{P}_{\delta\phi}(k) = \left(\frac{k}{2\pi R}\right)^2 = \left(\frac{H}{2\pi}\right)^2_{k=RH}.
\tag{16.63}
$$

As above, we have evaluated the second expression at the ‘horizon crossing’ 
value of $k$, namely $RH$, since fluctuations on larger length scales are ‘frozen in’ 
at the value they reached at this point. To translate this result into the power 
spectrum of curvature perturbations $\mathcal{R}$, we need to link $\mathcal{R}$ and 
$\delta\phi$. Consider the change $\Delta t$ in time coordinate that would be needed to 
move between the ‘comoving slicing’, in which $\delta\phi$ vanishes, and a ‘flat slicing’, 
in which $\Phi$ vanishes. Since $\zeta$, as defined in (16.39), remains constant in this 
process, we see that, in this case, 

$$
\Delta\Phi = \Phi = -\mathcal{R} = H\,\Delta t \quad\text{and}\quad \Delta\phi = \delta\phi = -\dot\phi_0\,\Delta t.
\tag{16.64}
$$

Eliminating $\Delta t$ we find that $\mathcal{R} = H\,\delta\phi/\dot\phi_0$ and hence we 
recover the result 

$$
\mathcal{P}_{\mathcal{R}}(k) = \left(\frac{H^2}{2\pi\dot\phi_0}\right)^2_{k=RH}.
\tag{16.65}
$$


16.18 Power spectrum of matter-density perturbations 

As discussed in Section 16.6, at the end of inflation the scalar field decays 
into other particles. Thus, one is left with a spectrum of curvature perturbations, 
which from (16.40) is equivalent to the spectrum of fluctuations in the gravitational 
potential $\delta\Phi$ in a comoving slicing. These in turn may be related to the 
corresponding fluctuations $\delta\rho$ in the matter density. The full general-relativistic 
equations describing the evolution under gravity of these density fluctuations may 
be obtained by repeating all the above discussion for a perfect fluid rather than 
a scalar field. We will not pursue this calculation here but merely note that the 
resulting equations are the same as those obtained using Newtonian theory, except 
for a term that is important only on super-horizon scales. Therefore, on sub-horizon 
scales, to a good approximation we may take these potential fluctuations 
as obeying the perturbed Poisson’s equation in Newtonian gravity, 

$$
\nabla^2(\delta\Phi) = 4\pi G\,(\delta\rho),
$$

where $\delta\rho$ is the fluctuation in the matter density corresponding to that in the 
potential. Indeed, we might have expected the Newtonian theory to be a good 
approximation on sub-horizon scales since the gravitational field associated with 
the perturbations is weak. 

It is more common to work instead in terms of the fractional-density fluctuation 
$\delta = \delta\rho/\rho_0$, where $\rho_0$ is the background matter density. Thus, working 
in Fourier space, we have 

$$
\delta\Phi_k = -\frac{4\pi G\rho_0R^2}{k^2}\,\delta_k.
$$

Using $\rho_0 = 3H^2/(8\pi G)$, for the simple spatially flat case that we are 
considering we see that 

$$
\delta_k = -\frac{2}{3}\left(\frac{k}{RH}\right)^2\delta\Phi_k,
\tag{16.66}
$$

from which we deduce that $\langle|\delta_k|^2\rangle \propto k^4\langle|\delta\Phi_k|^2\rangle$. 
Therefore, defining the matter power spectrum by $P_\delta(k) = \langle|\delta_k|^2\rangle$ 
(note that this differs from the definition of $\mathcal{P}_{\mathcal{R}}(k)$ by a factor 
$k^3/(2\pi^2)$, as mentioned earlier), we find that 
$P_\delta(k) \propto k\,\mathcal{P}_{\mathcal{R}}(k)$. Since $\mathcal{P}_{\mathcal{R}}(k)$ is 
roughly constant for slow-roll inflation, we thus obtain 

$$
P_\delta(k) \propto k.
\tag{16.67}
$$

In general, the matter power spectrum is parameterised as $P_\delta(k) \propto k^n$, 
where $n$ is known as the primordial spectral index. We therefore see that inflation 
naturally predicts $n = 1$, which is also known as the Harrison-Zel’dovich spectrum. 

An alternative way of characterising this spectrum is to note from (16.66) that 
if we do not define the perturbation spectrum at a single instant of cosmic time 
but evaluate it when a given scale re-enters the horizon ($k = RH$) then 

$$
\delta_k \sim \delta\Phi_k.
$$

Since the spectrum $\mathcal{P}_{\mathcal{R}}(k)$, defined as the contribution to the total 
variance per unit logarithmic interval of $k$ at a single instant of time, is roughly 
constant, then so too is the matter power spectrum defined in the same way but 
evaluated at horizon entry. This is why the Harrison-Zel’dovich spectrum is also 
known as the scale-invariant spectrum. The fractional-density perturbations, as they 
enter the horizon, make a constant contribution to the total variance per unit 
logarithmic interval of $k$. 

Finally, we note from (16.66) that, at a given $k$, the time evolution of the 
fractional-density perturbation $\delta_k$ is given by 

$$
\delta_k(t) \propto \frac{1}{R^2(t)H^2(t)}.
$$

For a radiation-dominated model we have $R \propto t^{1/2}$ and $H = 1/(2t)$, whereas for 
a matter-dominated model $R \propto t^{2/3}$ and $H = 2/(3t)$. Thus, we find 

$$
\delta_k(t) \propto
\begin{cases}
t & \text{(radiation-dominated)}, \\
t^{2/3} & \text{(matter-dominated)},
\end{cases}
$$

which provides a quick derivation of the time dependence of what is known as 
the growing mode of the matter-density perturbations. In particular, we note that 
the time dependence of this mode is the same as that of the scale factor $R$ in the 
matter-dominated case. 
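
The two growth laws can be confirmed with a short numerical check. The sketch below assumes a power-law expansion $R \propto t^p$ (so that $H = p/t$) and the scaling $\delta_k \propto 1/(RH)^2$ at fixed $k$; the function name `delta_growth_exponent` and the sample times are hypothetical.

```python
import math


def delta_growth_exponent(p, t1=1.0, t2=10.0):
    """Logarithmic slope of 1/(R H)^2 between t1 and t2 for R ~ t**p."""
    def inv_RH_squared(t):
        R = t ** p
        H = p / t                 # H = (dR/dt)/R for a power-law scale factor
        return 1.0 / (R * H) ** 2

    return math.log(inv_RH_squared(t2) / inv_RH_squared(t1)) / math.log(t2 / t1)


print(delta_growth_exponent(0.5))        # radiation-dominated expansion
print(delta_growth_exponent(2.0 / 3.0))  # matter-dominated expansion
```

Analytically the slope is $2 - 2p$, which gives $1$ for $p = 1/2$ and $2/3$ for $p = 2/3$.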


16.19 Comparison of theory and observation 

The details of the comparison of the inflationary prediction for the perturbation 
spectrum with cosmological observations would take us too far afield here. We 
thus content ourselves with two brief illustrations. Figure 16.6 shows the prediction 
for the power spectrum of anisotropies in the cosmic microwave background 
radiation, assuming an early-universe perturbation spectrum that is exactly scale 
invariant. The anisotropies in the temperature of the CMB radiation provide a 
‘snapshot’ of the (projected) density perturbations at the epoch of recombination 
($z_{\rm rec} \approx 1500$). The CMB anisotropies over the sky are usually decomposed in 
terms of spherical harmonics as 

$$
\frac{\Delta T}{T}(\theta, \phi) = \sum_{\ell=2}^{\infty}\sum_{m=-\ell}^{\ell} a_{\ell m}Y_{\ell m}(\theta, \phi),
$$

where the $\ell = 0$ (constant) and $\ell = 1$ (dipole) terms are usually ignored, since 
the former is unrelated to the anisotropies and the latter is due to the peculiar 
velocity of the Earth with respect to the comoving frame of the CMB. The power 
in the fluctuations as a function of angular scale is therefore characterised by the 
spectrum 

$$
C_\ell = \frac{1}{2\ell + 1}\sum_{m=-\ell}^{\ell}|a_{\ell m}|^2.
$$

Figure 16.6 The predicted power spectrum of CMB temperature anisotropies 
(solid line), assuming an early-universe perturbation spectrum that is exactly 
scale invariant. The points show the results of recent observations of the CMB 
anisotropies by the Wilkinson Microwave Anisotropy Probe (WMAP, circles), 
Very Small Array (VSA, squares) and Arcminute Cosmology Bolometer Array 
Receiver (ACBAR, triangles) experiments. The vertical error bars indicate the 
68 per cent uncertainty in the measured value. 

The characteristic peaks in the predicted CMB power spectrum (solid line) are 
a consequence of another feature of inflation that we have already seen in our 
equations, namely that all modes outside the horizon are frozen and can only start 
to oscillate once they re-enter the horizon later in the universe’s evolution. This 
means that a ‘phasing up’ is able to occur, in which all modes of interest start 
from effectively a ‘zero velocity’ state when they begin the oscillations, during 
the epoch of recombination, that lead to the CMB imprints. This is what enables 
peaks to be visible in the power spectrum, with modes on different scales able 
to complete a different number of oscillations before the end of recombination. 
Coherence, leading to peaks, is maintained since each mode has the same starting 
conditions. This is only possible if the modes of interest are indeed on super-horizon 
scales prior to recombination, and the only known way of achieving 
this is via inflation. Thus the peaks visible in the predictions of Figure 16.6 
are a powerful means of testing for inflation. The points shown in the figure 
are the results of recent observations of the CMB anisotropies by the WMAP 
(circles), VSA (squares) and ACBAR (triangles) experiments, which yield a very 
impressive confirmation of the peak structure and thereby a direct confirmation 
that inflation occurred. 



[Plot for Figure 16.7; horizontal axis: $k$ ($h\,{\rm Mpc}^{-1}$)] 

Figure 16.7 The predicted power spectrum of matter fluctuations (solid line) 
assuming an early-universe perturbation spectrum that is exactly scale invariant. 
The points show the results derived from the 2dF sample of galaxy redshift 
measurements. The horizontal error bars indicate the width of the bin in k-space 
over which the measurement is made and the vertical error bars indicate the 
68 per cent uncertainty in the measured value. 

From the current data it is, however, not possible to tell whether the primordial 
spectrum is exactly scale invariant, as assumed in generating the prediction, or 
whether it has the slight decrease at larger k, and therefore smaller scales, that we 
said was also a generic prediction of inflation. This question should be resolved by 
future experimental results, particularly from the CMB on smaller angular scales 
and from measurements of the matter distribution on a range of scales. In the 
latter case, one may compare the observed distribution of galaxies (both over the 
sky and in redshift) with the predicted power spectrum for matter fluctuations in 
the universe. The primordial matter power spectrum $\mathcal{P}_\zeta(k)$ in (16.67) is modified 
by the evolution under gravity of the perturbations once they re-enter the horizon. 
This effect may be calculated, and the predicted matter power spectrum resulting 
from an exactly scale-invariant primordial spectrum from inflation is shown as 
the solid line in Figure 16.7. Once again, we see that the predicted spectrum 
has oscillations resulting from a mechanism analogous to that which produces 
the oscillations in the CMB power spectrum discussed above. The points in the 
figure show the measurements derived from the 2dF (2 degree field) sample of 
galaxy redshift measurements. Again, a good fit to the data is visible, and time 
will tell whether the detailed dynamics of inflation, which can be measured by the 
departures from scale invariance, will become accessible from the combination 
of data of this type and future CMB experiments. 


Exercises 

16.1 In the cosmological field equation 

$$\dot R^2 = \tfrac{1}{3}\rho R^2 - k,$$

show that, if $p < -\tfrac{1}{3}\rho$, the curvature term becomes negligible as the universe 
expands. 

16.2 Show that the energy-momentum tensor of a scalar field, 

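$$T_{\mu\nu} = (\partial_\mu\phi)(\partial_\nu\phi) - g_{\mu\nu}\left[\tfrac{1}{2}(\partial_\sigma\phi)(\partial^\sigma\phi) - V(\phi)\right],$$

satisfies the condition $\nabla_\mu T^{\mu\nu} = 0$. 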

16.3 Show that a scalar field acts like a perfect fluid with an energy density and pressure 
given by 

$$\rho_\phi = \tfrac{1}{2}\dot\phi^2 + V(\phi) + \tfrac{1}{2}(\nabla\phi)^2,$$

$$p_\phi = \tfrac{1}{2}\dot\phi^2 - V(\phi) - \tfrac{1}{6}(\nabla\phi)^2.$$

Show further that if the field $\phi$ is spatially constant then inflation will occur, 
provided that $\dot\phi^2 < V(\phi)$. If, in addition, the scalar field does not change with time, 
show that its equation of state is $p_\phi = -\rho_\phi$ and that it thus acts as an effective 
cosmological constant. 

16.4 Show that the equation of motion for a scalar field with potential $V(\phi)$ is 

$$\ddot\phi + 3H\dot\phi + \frac{dV}{d\phi} = 0.$$

Hence find the general solution for a free scalar field $\phi$, for which $V = \tfrac{1}{2}m^2\phi^2$, in 
the case where $H$ is approximately constant. 

16.5 For a potential of the form 

$$V(\phi) = V_0\exp(-\lambda\phi),$$

where $\lambda$ is a positive constant, show that the inflation equations can be solved 
exactly to give 

$$R(t) = R_0\left(\frac{t}{t_0}\right)^{2/\lambda^2}.$$

Hence show that, provided $\lambda < \sqrt{2}$, the solution corresponds to a period of inflation. 
Show further that the slow-roll parameters for this model are $\epsilon = \tfrac{1}{2}\eta = \tfrac{1}{2}\lambda^2$, and so 
the inflationary epoch never ends. This model is known as power-law inflation. 

16.6 Show that, in general, 

$$\ddot R = R(\dot H + H^2).$$

Show that $\dot H > 0$ only if $p < -\rho$, which is forbidden by the weak energy condition 
(see Exercise 8.8). Hence show that, for inflation to occur, one requires 


$$-\frac{\dot H}{H^2} < 1,$$

and thus that the first slow-roll parameter must obey $\epsilon < 1$. 

16.7 In the slow-roll approximation, show that 

Assuming that $\phi$ varies monotonically with $t$ throughout the period of inflation, 
show that 

$$\dot\phi = -2H'(\phi),$$

where $H$ is now considered as a function of $\phi$, and hence that we may write the 
cosmological field equation as 

$$[H'(\phi)]^2 - \tfrac{3}{2}H^2(\phi) = -\tfrac{1}{2}V(\phi).$$

This is known as the Hamilton-Jacobi formalism for inflation. 

16.8 Repeat Exercise 16.5 using the Hamilton-Jacobi formalism developed in 
Exercise 16.7. 

16.9 In the Hamilton-Jacobi formalism developed in Exercise 16.7, show that the 
condition for inflation to occur is 


16.10 Show that, during an exponential expansion phase of the universe, the proper 
distance between any two comoving objects separated by more than $H^{-1}$ grows 
at a speed exceeding the speed of light. Hence show that an observer in such a 
universe can only see processes occurring inside the ‘horizon’ radius $H^{-1}$, and so 
the process of inflation in any spatial domain of radius $H^{-1}$ (or ‘mini-universe’) 
occurs independently of any events outside it. 

16.11 A fluctuation $\delta\phi$ in the scalar inflation field leads to a local delay of the end of 
inflation by $\delta t \sim \delta\phi/\dot\phi$. Assuming that the density of the universe after inflation 
decreases as $t^{-2}$, show that the fluctuation in the scalar field leads to a relative 
density contrast at the end of inflation given by 

$$\frac{\delta\rho}{\rho} \sim \frac{H\,\delta\phi}{\dot\phi}.$$

Assuming the root mean square (rms) scalar field perturbation to be $\delta\phi_{\rm rms} \sim H/(2\pi)$, 
show that 

$$\left(\frac{\delta\rho}{\rho}\right)_{\rm rms} \sim \frac{H^2}{2\pi\dot\phi}.$$

16.12 Consider an inflationary domain (or mini-universe in the context of Exercise 16.10) 
of initial radius $H^{-1}$, in which the value of the scalar field $\phi \gg 1$. In a time interval 
$\Delta t = H^{-1}$, show that classically, in the slow-roll approximation, the value of $\phi$ 
will change by 

$$\Delta\phi \approx -\frac{V'}{V}.$$

Assuming that the typical amplitude of quantum fluctuations in the scalar field is 
$\delta\phi \approx H/(2\pi)$, show that 

$$\delta\phi \approx \frac{1}{2\pi}\sqrt{\frac{V}{3}}.$$

Hence, for the case $V(\phi) = \tfrac{1}{2}m^2\phi^2$, show that the decrease in the value of the scalar 
field due to its classical motion is less than the changes due to quantum fluctuations 
generated in the same time interval, provided that 

, 6 

<t >» ~j=- 
*Jm 

Assuming that the typical wavelength of the quantum fluctuation $\delta\phi$ is $H^{-1}$, 
show that, after a time interval $\Delta t = H^{-1}$, the original domain becomes effectively 
divided into $e^3 \approx 20$ domains of radius $H^{-1}$, each containing a roughly homogeneous 
scalar field $\phi + \Delta\phi + \delta\phi$. Thus, on average, the volume of the universe 
containing a growing $\phi$-field increases by a factor $\sim 10$ after every time interval 
$\Delta t = H^{-1}$. 

Note: This is the mechanism underlying stochastic inflation. 

16.13 For the line element 

$$ds^2 = (1+2\Phi)\,dt^2 - (1-2\Phi)R^2(t)(dx^2 + dy^2 + dz^2),$$

show that, to first order in $\Phi$, the perturbed parts of the connection coefficients 
take the form 


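$$\delta\Gamma^0{}_{0i} = \partial_i\Phi,$$

$$\delta\Gamma^0{}_{ii} = -R^2(\dot\Phi + 4H\Phi),$$

$$\delta\Gamma^i{}_{00} = \frac{1}{R^2}\,\partial_i\Phi,$$

$$\delta\Gamma^i{}_{ij} = -\partial_j\Phi,$$

where no sum over repeated $i$ indices is implied and the remaining perturbed 
coefficients either follow from symmetry or are zero. Hence show that the 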
perturbed part of the Einstein tensor is given by 

$$\delta G^0{}_i = -2\,\partial_i(\dot\Phi + H\Phi),$$

$$\delta G^0{}_0 = -2(\nabla^2\Phi - 3H\dot\Phi - 3H^2\Phi),$$

$$\delta G^i{}_i = 2[\ddot\Phi + 4H\dot\Phi + (2\dot H + 3H^2)\Phi],$$


where again no sum over repeated i indices is implied and the remaining entries 
either follow from symmetry or are zero. 

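16.14 For the scalar-field perturbation 

$$\phi(t,\mathbf{x}) = \phi_0(t) + \delta\phi(t,\mathbf{x}),$$

show that, to first order in $\delta\phi$, the perturbed parts of the scalar-field 
energy-momentum tensor are given by 

$$\delta T^0{}_i = \dot\phi_0\,\partial_i(\delta\phi),$$

$$\delta T^0{}_0 = -\dot\phi_0^2\,\Phi + \dot\phi_0\,\delta\dot\phi + V'\delta\phi,$$

$$\delta T^i{}_i = \dot\phi_0^2\,\Phi - \dot\phi_0\,\delta\dot\phi + V'\delta\phi,$$

where $V' = dV/d\phi_0$ and the remaining components either follow from symmetry 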
or are zero. 

16.15 Use your answers to Exercises 16.13 and 16.14 to show that the perturbed Einstein 
field equations yield only the two equations 

$$\dot\Phi + H\Phi = \tfrac{1}{2}\dot\phi_0\,\delta\phi,$$


(« + 2 = 

16.16 Show that the gauge-invariant Fourier curvature perturbation 

$$\zeta_k = \Phi_k + \frac{H\,\delta\phi_k}{\dot\phi_0}$$

satisfies the equation of motion 

$$\ddot\zeta_k + \left(\frac{\dot\phi_0^2}{H} + \frac{2\ddot\phi_0}{\dot\phi_0} + 3H\right)\dot\zeta_k + \frac{k^2}{R^2}\,\zeta_k = 0.$$

Defining the new variables $d\eta = c\,dt/R$ and $\xi_k = \alpha\zeta_k$, where $\alpha = R\dot\phi_0/H$, show 
further that 

$$\xi_k'' + \left(k^2 - \frac{\alpha''}{\alpha}\right)\xi_k = 0,$$

where a prime denotes $d/d\eta$ and the ‘effective mass’ $m_{\rm eff}^2 = -\alpha''/\alpha$ is given by 


m 


2 

f 


K WoM K W 

2 RH 2R 2 H 2 4>' 0 ’ 


-2 R 2 H 2 


i’o . <fto , <4 <4 , 3^> 0 ft, \ 

2/F 2 4// 4 i/ 3 ' 2H<f> 0 2H 2 4> 0 J 


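16.17 Consider the equation of motion 

$$\xi_k'' + \left[k^2 - \frac{2}{(\eta_{\rm end} - \eta)^2}\right]\xi_k = 0.$$

Show that the unique solution (up to a phase factor) that tends to (16.51) for small 
$\eta$ is given by 

$$\xi_k = \frac{1}{\sqrt{2k^3}}\,\frac{i + k(\eta_{\rm end} - \eta)}{\eta_{\rm end} - \eta}\,e^{-ik\eta}.$$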


17 Linearised general relativity 

The gravitational field equations give a quantitative description of how the curvature 
of spacetime at any event is related to the energy-momentum distribution at 
that event. The high degree of non-linearity in these field equations means that 
a general solution for an arbitrary matter distribution is analytically intractable. 
Consequently, thus far we have concentrated primarily on investigating a number 
of special solutions that represent spacetimes with particular symmetries (aside 
from our discussion of perturbations in the previous chapter). In this chapter, 
we return to a more general investigation of the gravitational field equations and 
their solutions. To enable such a study, however, one must make the physical 
assumption that the gravitational fields are weak. Mathematically, this assumption 
corresponds to linearising the gravitational field equations. 


17.1 The weak-field metric 


As discussed in Sections 7.6 and 8.6, a weak gravitational field corresponds to a 
region of spacetime that is only ‘slightly’ curved. Thus, throughout such a region, 
there exist coordinate systems in which the spacetime metric takes the form 


$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \quad\text{where } |h_{\mu\nu}| \ll 1,$$ (17.1) 

and the first and higher partial derivatives of $h_{\mu\nu}$ are also small.¹ Such coordinates 
are often termed quasi-Minkowskian coordinates, since they allow the metric 
to be written in a close-to-Minkowski form. Clearly, $h_{\mu\nu}$ must be symmetric 
with respect to the swapping of its indices. We also note that, when previously 
considering the weak-field limit, we further assumed that the metric was stationary, 
so that $\partial_0 g_{\mu\nu} = \partial_0 h_{\mu\nu} = 0$, where $x^0$ is the timelike coordinate. In our present 
discussion, however, we wish to retain the possibility of describing time-varying 
weak gravitational fields, and so we shall not make this additional assumption here. 

¹ We note that one could equally well consider small perturbations about some other background metric, such 
that $g_{\mu\nu} = g^{(0)}_{\mu\nu} + h_{\mu\nu}$. This was the case in our discussion of inflationary perturbations in the previous chapter, 
in which $g^{(0)}_{\mu\nu}$ was the metric for the background Friedmann-Robertson-Walker spacetime in comoving 
Cartesian coordinates. 

As we have stressed many times, coordinates are arbitrary and, in principle, 
one could develop the description of weak gravitational fields in any coordinate 
system. Nevertheless, by adopting quasi-Minkowskian coordinates the mathematical 
labour of pursuing our analysis is greatly simplified, as is the interpretation of 
the resulting expressions. If one coordinate system exists in which (17.1) holds, 
however, then there must be many such coordinate systems. Indeed, two different 
types of coordinate transformation connect quasi-Minkowskian systems to 
each other: global Lorentz transformations and infinitesimal general coordinate 
transformations, both of which we now discuss. 

Global Lorentz transformations 
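Global Lorentz transformations are of the form 

$$x'^\mu = \Lambda^\mu{}_\nu x^\nu, \quad\text{where } \eta_{\mu\nu} = \Lambda^\rho{}_\mu\Lambda^\sigma{}_\nu\eta_{\rho\sigma},$$

and the quantities $\Lambda^\mu{}_\nu$ are constant everywhere. These transform the metric 
coefficients as follows: 

$$g'_{\mu\nu} = \frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}\,g_{\rho\sigma} = \Lambda_\mu{}^\rho\Lambda_\nu{}^\sigma(\eta_{\rho\sigma} + h_{\rho\sigma}) = \eta_{\mu\nu} + \Lambda_\mu{}^\rho\Lambda_\nu{}^\sigma h_{\rho\sigma},$$

where $\Lambda_\mu{}^\rho = \partial x^\rho/\partial x'^\mu$ denotes the inverse transformation. Thus, $g'_{\mu\nu}$ is also 
of the form (17.1), with 

$$h'_{\mu\nu} = \Lambda_\mu{}^\rho\Lambda_\nu{}^\sigma h_{\rho\sigma}.$$

Moreover, we see from this expression that, under a Lorentz transformation, $h_{\mu\nu}$ 
itself transforms like the components of a tensor in Minkowski spacetime. 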
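The transformation law above can be checked numerically. The sketch below, in plain Python, applies a boost along the $x$-axis to a sample metric $\eta_{\mu\nu} + h_{\mu\nu}$ and confirms that the Minkowski part is unchanged while the perturbation transforms as a rank-2 tensor. The helper names and the numerical values of the perturbation are illustrative assumptions only.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def inverse_boost_x(beta):
    """Return L[mu][rho] = dx^rho/dx'^mu for a boost of speed beta*c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = L[1][1] = gamma
    L[0][1] = L[1][0] = gamma * beta
    L[2][2] = L[3][3] = 1.0
    return L


def transform(L, t):
    """Return t'[mu][nu] = L[mu][rho] * L[nu][sigma] * t[rho][sigma]."""
    return [[sum(L[m][r] * L[n][s] * t[r][s]
                 for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]


def max_abs_diff(a, b):
    """Largest componentwise difference between two 4x4 arrays."""
    return max(abs(a[m][n] - b[m][n]) for m in range(4) for n in range(4))


# A small, symmetric perturbation (illustrative numbers only).
h = [[2e-6, 1e-6, 0.0, 0.0],
     [1e-6, -3e-6, 5e-7, 0.0],
     [0.0, 5e-7, 1e-6, 2e-7],
     [0.0, 0.0, 2e-7, 4e-6]]
g = [[ETA[m][n] + h[m][n] for n in range(4)] for m in range(4)]

L = inverse_boost_x(0.6)
eta_new = transform(L, ETA)      # should reproduce ETA
g_new = transform(L, g)          # transformed full metric
h_new = transform(L, h)          # perturbation transformed as a tensor
h_from_metric = [[g_new[m][n] - ETA[m][n] for n in range(4)] for m in range(4)]
```

Note that the components of the transformed perturbation scale with powers of the Lorentz factor, so for a sufficiently extreme boost they need not remain small.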

The above property suggests a convenient alternative viewpoint when describing 
weak gravitational fields. Instead of considering a slightly curved spacetime 
representing the general-relativistic weak field, we can consider $h_{\mu\nu}$ simply as a 
symmetric rank-2 tensor field defined on the flat Minkowski background spacetime 
in Cartesian inertial coordinates. In other words, $h_{\mu\nu}$ is considered as a 
special-relativistic gravitational field, in an analogous way to that in which the 
4-potential $A^\mu$ describes the electromagnetic field in Minkowski spacetime, as 
discussed in Chapter 6; we return to this point below. We note, however, that 
$h_{\mu\nu}$ does not transform as a tensor under a general coordinate transformation but 
only under the restricted class of global Lorentz transformations; for this reason 
$h_{\mu\nu}$ and tensors derived from it are sometimes called pseudotensors, although we 
will not use this terminology. 

Infinitesimal general coordinate transformations 
Infinitesimal general coordinate transformations take the form 

$$x'^\mu = x^\mu + \xi^\mu(x),$$ (17.2) 

where the $\xi^\mu(x)$ are four arbitrary functions of position of the same order of 
smallness as the $h_{\mu\nu}$. Infinitesimal transformations of this sort make tiny changes 
in the forms of all scalar, vector and tensor fields, but these can be ignored in 
all quantities except the metric, where tiny deviations from $\eta_{\mu\nu}$ contain all the 
information about gravity. From (17.2), we have 

$$\frac{\partial x'^\mu}{\partial x^\nu} = \delta^\mu_\nu + \partial_\nu\xi^\mu,$$

and, working to first order in small quantities, it is straightforward to show that 
the inverse transformation is given by² 

$$\frac{\partial x^\mu}{\partial x'^\nu} = \delta^\mu_\nu - \partial_\nu\xi^\mu.$$ (17.3) 

Thus, again working to first order in small quantities, the metric transforms as 
follows: 

$$g'_{\mu\nu} = \frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}\,g_{\rho\sigma} = (\delta^\rho_\mu - \partial_\mu\xi^\rho)(\delta^\sigma_\nu - \partial_\nu\xi^\sigma)(\eta_{\rho\sigma} + h_{\rho\sigma})$$

$$\qquad = \eta_{\mu\nu} + h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu,$$

where we have defined $\xi_\mu = \eta_{\mu\nu}\xi^\nu$. Hence, we see that $g'_{\mu\nu}$ is also of the form 
(17.1), the new metric perturbation functions being related to the old ones via 

$$h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu.$$ (17.4) 

If we adopt the viewpoint in which $h_{\mu\nu}$ is considered as a tensor field defined 
on the flat Minkowski background spacetime, then (17.4) can be considered 
as analogous to a gauge transformation in electromagnetism. As discussed in 
Chapter 6, if $A_\mu$ is a solution of the electromagnetic field equations then another 
solution that describes precisely the same physical situation is given by 

$$A^{\rm (new)}_\mu = A_\mu + \partial_\mu\psi,$$

where $\psi$ is any scalar field. An analogous situation holds in the case of the 
gravitational field. From (17.4), it is clear that if $h_{\mu\nu}$ is a solution to the linearised 
gravitational field equations (see below) then the same physical situation is also 
described by 

$$h^{\rm (new)}_{\mu\nu} = h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu.$$ (17.5) 

In this interpretation, however, (17.5) is viewed as a gauge transformation rather 
than a coordinate transformation. In other words, we are still working in the same 
set of coordinates and have defined a new tensor $h^{\rm (new)}_{\mu\nu}$ whose components in 
this basis are given by (17.5). 

² Note that, for the remainder of this chapter, the normal symbol for equality will be used to indicate equality 
up to first order in small quantities as well as exact equality. 

Now that we have considered the coordinate transformations that preserve the 
form of the metric $g_{\mu\nu}$ in (17.1), it is useful to obtain the corresponding form for 
the contravariant metric coefficients $g^{\mu\nu}$. By demanding that $g^{\mu\nu}g_{\nu\sigma} = \delta^\mu_\sigma$, it is 
straightforward to verify that, to first order in small quantities, we must have 

$$g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu},$$

where $h^{\mu\nu} = \eta^{\mu\rho}\eta^{\nu\sigma}h_{\rho\sigma}$. Moreover, it follows that indices on small quantities 
may be respectively raised and lowered using $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$ rather than $g^{\mu\nu}$ and 
$g_{\mu\nu}$. For example, to first order in small quantities, we may write 

$$h^\mu{}_\nu = g^{\mu\sigma}h_{\sigma\nu} = (\eta^{\mu\sigma} - h^{\mu\sigma})h_{\sigma\nu} = \eta^{\mu\sigma}h_{\sigma\nu}.$$
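A quick numerical check of the first-order inverse metric is sketched below in plain Python. It forms $g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu}$ for a perturbation of size `eps` and measures how far $g^{\mu\nu}g_{\nu\sigma}$ is from $\delta^\mu_\sigma$; the residual should be of second order in `eps`. The matrix `PATTERN` and the function names are assumptions introduced purely for illustration.

```python
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

# Symmetric pattern of order-unity numbers; h = eps * PATTERN (illustrative).
PATTERN = [[1.0, 0.5, 0.0, -0.3],
           [0.5, -0.8, 0.2, 0.0],
           [0.0, 0.2, 0.6, 0.4],
           [-0.3, 0.0, 0.4, -1.0]]


def raise_indices(h):
    """h^{mu nu} = eta^{mu rho} eta^{nu sigma} h_{rho sigma}; eta is diagonal."""
    return [[ETA[m][m] * ETA[n][n] * h[m][n] for n in range(4)] for m in range(4)]


def inverse_residual(eps):
    """Largest entry of |g^{mu nu} g_{nu sigma} - delta^mu_sigma| for h = eps*PATTERN."""
    h = [[eps * PATTERN[m][n] for n in range(4)] for m in range(4)]
    h_up = raise_indices(h)
    g_lo = [[ETA[m][n] + h[m][n] for n in range(4)] for m in range(4)]
    # eta^{mu nu} has the same numerical components as eta_{mu nu}.
    g_up = [[ETA[m][n] - h_up[m][n] for n in range(4)] for m in range(4)]
    worst = 0.0
    for m in range(4):
        for s in range(4):
            total = sum(g_up[m][n] * g_lo[n][s] for n in range(4))
            delta = 1.0 if m == s else 0.0
            worst = max(worst, abs(total - delta))
    return worst


residual_coarse = inverse_residual(1e-3)
residual_fine = inverse_residual(5e-4)
```

Halving `eps` should reduce the residual by a factor of about four, which is the signature of an error that is quadratic in the perturbation.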

17.2 The linearised gravitational field equations 

In the weak-field approximation to general relativity, one expands the gravitational 
field equations in powers of $h_{\mu\nu}$, using a coordinate system where (17.1) holds. On 
keeping only the linear terms, we thus arrive at the linearised version of general 
relativity. The Einstein gravitational field equations were derived in Section 8.4 
and read 

$$R_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}R = -\kappa T_{\mu\nu}.$$

To obtain the linearised form of these equations, we thus need to find the linearised 
expression for the Riemann tensor $R^\sigma{}_{\mu\nu\rho}$; the corresponding expressions for the 
Ricci tensor $R_{\mu\nu}$ and the Ricci scalar $R$ then follow by the contraction of indices. 

To perform this task, we first need the linearised form of the connection 
coefficients $\Gamma^\sigma{}_{\mu\nu}$. To first order in small quantities we have 

$$\Gamma^\sigma{}_{\mu\nu} = \tfrac{1}{2}\eta^{\sigma\rho}(\partial_\nu h_{\rho\mu} + \partial_\mu h_{\rho\nu} - \partial_\rho h_{\mu\nu}) = \tfrac{1}{2}(\partial_\nu h^\sigma_\mu + \partial_\mu h^\sigma_\nu - \partial^\sigma h_{\mu\nu}),$$ (17.6) 

where we have defined $\partial^\sigma \equiv \eta^{\sigma\rho}\partial_\rho$. We may now substitute (17.6) directly into 
the expression (7.13) for the Riemann tensor, namely 

$$R^\sigma{}_{\mu\nu\rho} = \partial_\nu\Gamma^\sigma{}_{\mu\rho} - \partial_\rho\Gamma^\sigma{}_{\mu\nu} + \Gamma^\tau{}_{\mu\rho}\Gamma^\sigma{}_{\tau\nu} - \Gamma^\tau{}_{\mu\nu}\Gamma^\sigma{}_{\tau\rho}.$$ (17.7) 
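
The last two terms on the right-hand side are products of connection coefficients 
and so will clearly be second order in $h_{\mu\nu}$; they will therefore be ignored. Hence, 
to first order, we obtain 

$$R^\sigma{}_{\mu\nu\rho} = \tfrac{1}{2}\partial_\nu(\partial_\rho h^\sigma_\mu + \partial_\mu h^\sigma_\rho - \partial^\sigma h_{\mu\rho}) - \tfrac{1}{2}\partial_\rho(\partial_\nu h^\sigma_\mu + \partial_\mu h^\sigma_\nu - \partial^\sigma h_{\mu\nu})$$

$$\qquad = \tfrac{1}{2}(\partial_\nu\partial_\mu h^\sigma_\rho + \partial_\rho\partial^\sigma h_{\mu\nu} - \partial_\nu\partial^\sigma h_{\mu\rho} - \partial_\rho\partial_\mu h^\sigma_\nu),$$

which is easily shown to be invariant under a gauge transformation of the form (17.5). 
The linearised Ricci tensor is obtained by contracting the above expression for 
$R^\sigma{}_{\mu\nu\rho}$ on its first and last indices. This yields 

$$R_{\mu\nu} = \tfrac{1}{2}(\partial_\mu\partial_\nu h + \Box^2 h_{\mu\nu} - \partial_\nu\partial_\rho h^\rho_\mu - \partial_\mu\partial_\rho h^\rho_\nu),$$ (17.8) 

where we have defined the trace $h \equiv h^\sigma_\sigma$ and the d'Alembertian operator $\Box^2 \equiv 
\partial_\sigma\partial^\sigma$. The Ricci scalar is obtained by a further contraction, giving 

$$R \equiv R^\mu_\mu = \eta^{\mu\nu}R_{\mu\nu} = \Box^2 h - \partial_\rho\partial_\mu h^{\mu\rho}.$$ (17.9) 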
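The gauge invariance of the linearised Riemann tensor stated above can be illustrated numerically for a single Fourier mode. In the Python sketch below the perturbation is assumed to be a plane wave, $h_{\mu\nu} = H_{\mu\nu}\exp(ik_\rho x^\rho)$, so that each derivative becomes a factor $ik_\mu$; the fully covariant form $R_{\sigma\mu\nu\rho}$ is used for convenience. The function names and the sample numbers are hypothetical choices for this example.

```python
import itertools


def riemann_amplitude(k, H):
    """Fourier amplitude of the linearised R_{sigma mu nu rho} for h = H exp(i k.x).

    k[mu] are covariant wave-vector components and H[mu][nu] is the symmetric
    covariant amplitude. Two derivatives give a factor (i k)(i k) = -k k.
    """
    R = {}
    for s, m, n, r in itertools.product(range(4), repeat=4):
        R[s, m, n, r] = -0.5 * (k[n] * k[m] * H[s][r] + k[r] * k[s] * H[m][n]
                                - k[n] * k[s] * H[m][r] - k[r] * k[m] * H[s][n])
    return R


def pure_gauge_amplitude(k, Xi):
    """Amplitude of -(d_mu xi_nu + d_nu xi_mu) for xi_mu = Xi_mu exp(i k.x)."""
    return [[-1j * (k[m] * Xi[n] + k[n] * Xi[m]) for n in range(4)]
            for m in range(4)]


k = [1.3, -0.4, 0.7, 0.2]        # arbitrary wave vector (need not be null)
Xi = [0.5, 1.1, -0.6, 0.9]       # arbitrary gauge-function amplitude
H = [[0.3, 0.1, 0.0, -0.2],
     [0.1, -0.5, 0.4, 0.0],
     [0.0, 0.4, 0.2, 0.1],
     [-0.2, 0.0, 0.1, -0.1]]

dH = pure_gauge_amplitude(k, Xi)
H_new = [[H[m][n] + dH[m][n] for n in range(4)] for m in range(4)]

R_old = riemann_amplitude(k, H)
R_new = riemann_amplitude(k, H_new)
max_change = max(abs(R_new[idx] - R_old[idx]) for idx in R_old)
max_curvature = max(abs(value) for value in R_old.values())
```

Here `max_change` measures how much the curvature amplitude shifts under the gauge transformation, while `max_curvature` confirms that the test case has non-zero curvature to begin with.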

Substituting the expressions (17.8) and (17.9) into the gravitational field equations 
we obtain the linearised form 

$$\partial_\mu\partial_\nu h + \Box^2 h_{\mu\nu} - \partial_\nu\partial_\rho h^\rho_\mu - \partial_\mu\partial_\rho h^\rho_\nu - \eta_{\mu\nu}(\Box^2 h - \partial_\rho\partial_\sigma h^{\sigma\rho}) = -2\kappa T_{\mu\nu}.$$ (17.10) 

The number of terms on the left-hand side of the field equations has clearly 
increased in the linearisation process. This can be simplified somewhat by defining 
the ‘trace reverse’ of $h_{\mu\nu}$, which is given by 

$$\bar h_{\mu\nu} \equiv h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}h.$$

On contracting indices we find that $\bar h = -h$. It is also straightforward to show 
that $\bar{\bar h}_{\mu\nu} = h_{\mu\nu}$, i.e. $h_{\mu\nu} = \bar h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}\bar h$. On substituting these expressions into 
(17.10), the field equations become 

$$\Box^2\bar h_{\mu\nu} + \eta_{\mu\nu}\partial_\rho\partial_\sigma\bar h^{\rho\sigma} - \partial_\nu\partial_\rho\bar h^\rho_\mu - \partial_\mu\partial_\rho\bar h^\rho_\nu = -2\kappa T_{\mu\nu}.$$ (17.11) 

These are the basic field equations of linearised general relativity and are valid 
whenever the metric takes the form (17.1). Unless otherwise stated, for the remainder 
of this chapter we will adopt the viewpoint that $h_{\mu\nu}$ is simply a symmetric 
tensor field (under global Lorentz transformations) defined in quasi-Cartesian 
coordinates on a flat Minkowski background spacetime. 
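The two properties of the trace reverse quoted above (its trace is minus that of the original field, and applying it twice returns the original field) are easy to confirm numerically. The following plain-Python sketch does so for an arbitrary symmetric array; the function names and sample values are assumptions made for illustration.

```python
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def trace(h):
    """Trace eta^{mu nu} h_{mu nu}; eta is diagonal, so only h[m][m] contributes."""
    return sum(ETA[m][m] * h[m][m] for m in range(4))


def trace_reverse(h):
    """Return h_{mu nu} - (1/2) eta_{mu nu} h, with h the trace."""
    tr = trace(h)
    return [[h[m][n] - 0.5 * ETA[m][n] * tr for n in range(4)] for m in range(4)]


# Arbitrary symmetric sample field (illustrative numbers only).
h = [[0.3, 0.1, 0.0, -0.2],
     [0.1, -0.5, 0.4, 0.0],
     [0.0, 0.4, 0.2, 0.1],
     [-0.2, 0.0, 0.1, -0.1]]

h_bar = trace_reverse(h)
h_back = trace_reverse(h_bar)
round_trip_error = max(abs(h_back[m][n] - h[m][n])
                       for m in range(4) for n in range(4))
```

The sign flip of the trace relies on $\eta^{\mu\nu}\eta_{\mu\nu} = 4$, which is why the operation is its own inverse in four dimensions.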

17.3 Linearised gravity in the Lorenz gauge 


The field equations (17.11) can be simplified further by making use of the gauge 
transformation (17.5). Denoting the gauge-transformed field by $h'_{\mu\nu}$ for convenience, 
the components of its trace reverse transform as 

$$\bar h'_{\mu\nu} = h'_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}h'$$

$$\qquad = h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu - \tfrac{1}{2}\eta_{\mu\nu}(h - 2\partial_\sigma\xi^\sigma)$$

$$\qquad = \bar h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu + \eta_{\mu\nu}\partial_\sigma\xi^\sigma,$$ (17.12) 

and hence we find that 

$$\partial_\rho\bar h'^{\mu\rho} = \partial_\rho\bar h^{\mu\rho} - \Box^2\xi^\mu.$$

Therefore, if we choose the functions $\xi^\mu(x)$ so that they satisfy 

$$\Box^2\xi^\mu = \partial_\rho\bar h^{\mu\rho},$$

then we have $\partial_\rho\bar h'^{\mu\rho} = 0$. The importance of this result is that, in this new gauge, 
each of the last three terms on the left-hand side of (17.11) vanishes. Thus, the 
field equations in the new gauge become 

$$\Box^2\bar h'_{\mu\nu} = -2\kappa T'_{\mu\nu}.$$

Let us take stock of the simplification we have just achieved. Dropping primes 
and raising indices for convenience, we have found that the linearised field 
equations may be written in the simplified form 

$$\Box^2\bar h^{\mu\nu} = -2\kappa T^{\mu\nu},$$ (17.13) 

provided that the $\bar h^{\mu\nu}$ satisfy the gauge condition 

$$\partial_\mu\bar h^{\mu\nu} = 0.$$ (17.14) 

Moreover, we note that this gauge condition is preserved by any further gauge 
transformation of the form (17.5) provided that the functions $\xi^\mu$ satisfy $\Box^2\xi^\mu = 0$. 
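The divergence used in the argument above follows in one step from (17.12). A short worked version is sketched here, raising indices with $\eta^{\mu\nu}$ as is permitted to first order; the grouping of terms is added for illustration and is not part of the original derivation.

```latex
\begin{align*}
\partial_\rho \bar{h}'^{\mu\rho}
  &= \partial_\rho\bar{h}^{\mu\rho}
     - \partial_\rho\partial^\mu\xi^\rho
     - \partial_\rho\partial^\rho\xi^\mu
     + \eta^{\mu\rho}\partial_\rho\partial_\sigma\xi^\sigma \\
  &= \partial_\rho\bar{h}^{\mu\rho} - \Box^2\xi^\mu
     + \left(\partial^\mu\partial_\sigma\xi^\sigma
     - \partial^\mu\partial_\rho\xi^\rho\right) \\
  &= \partial_\rho\bar{h}^{\mu\rho} - \Box^2\xi^\mu .
\end{align*}
```

The bracketed terms cancel because they differ only in the name of a dummy index. In particular, if the gauge condition already holds, a further transformation changes the divergence by $-\Box^2\xi^\mu$ alone, which is why the condition survives precisely when $\Box^2\xi^\mu = 0$.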

The above simplification is entirely analogous to that introduced in electromagnetism 
in Chapter 6. In that case, the electromagnetic field equations were 
reduced to the simple form $\Box^2 A^\mu = \mu_0 j^\mu$ by adoption of the Lorenz gauge condition 
$\partial_\mu A^\mu = 0$. This condition is preserved by any further gauge transformation 
$A^\mu \to A^\mu + \partial^\mu\psi$ if and only if $\Box^2\psi = 0$. As a result of the similarities between 
the electromagnetic and gravitational cases, (17.14) is often also referred to as the 
Lorenz gauge. 

17.4 General properties of the linearised field equations 

Now that we have arrived at the form of the field equations for linearised general 
relativity, it is instructive to consider some general consequences of our linearisation 
process for the resulting physical theory. The non-linearity of the original 
Einstein equations is a direct result of the fact that ‘gravity gravitates’. In other 
words, any form of energy-momentum acts as a source for the gravitational field, 
including the energy-momentum associated with the gravitational field itself. By 
linearising the field equations we have ignored this effect. 

One may straightforwardly take steps to address this shortcoming by ‘bootstrapping’ 
the theory as follows: (i) the energy-momentum carried by the linearised 
gravitational field $h_{\mu\nu}$ is calculated; (ii) this energy-momentum acts as a source for 
corrections $h^{(1)}_{\mu\nu}$ to the field; (iii) the energy-momentum carried by the corrections 
$h^{(1)}_{\mu\nu}$ is calculated; (iv) this energy-momentum acts as a source for corrections 
$h^{(2)}_{\mu\nu}$ to the corrections; and so on. It is widely stated in the literature³ that, 
on completing this bootstrapping process, one arrives back at the original non-linear 
field equations of general relativity, although this claim has recently been 
brought into question.⁴ In either case, it is worth noting that this approach allows 
the resulting equations to be interpreted simply as a (fully self-consistent) relativistic 
theory of gravity in a fixed Minkowski spacetime. This viewpoint brings 
gravitation closer in spirit to the field theories describing the other fundamental 
forces. Indeed, the remarkable point is that only the field theory of gravitation 
has the elegant geometrical interpretation that we have spent so long exploring. 

Returning to the linearised theory, one result of ignoring the energy-momentum 
carried by the gravitational field is an inconsistency between the linearised field 
equations (17.11) and the equations of motion for matter in a gravitational field. 
Raising the indices $\mu$ and $\nu$ on (17.11) and operating on both sides of the resulting 
equation with $\partial_\mu$, one quickly finds that 

$$\partial_\mu T^{\mu\nu} = 0.$$ (17.15) 

This should be contrasted with the requirement, derived from the full non-linear 
field equations, that $\nabla_\mu T^{\mu\nu} = 0$. As was shown in Section 8.8, the latter requirement 
leads directly to the geodesic equation of motion for the worldline $x^\mu(\tau)$ of 
a test particle, namely 

$$\ddot x^\mu + \Gamma^\mu{}_{\nu\sigma}\dot x^\nu\dot x^\sigma = 0,$$ (17.16) 

where the dots denote differentiation with respect to the proper time $\tau$. Performing 
a similar calculation for the condition (17.15), however, leads to the equation 
of motion $\ddot x^\mu = 0$, which means that the gravitational field has no effect on the 
motion of the particle. In general, this clearly contradicts the geodesic postulate. 

³ See, for example, R. P. Feynman, F. B. Morinigo & W. G. Wagner, Feynman Lectures on Gravitation, 
Addison-Wesley, 1995. 

⁴ See T. Padmanabhan, From Gravitons to Gravity: Myths and Reality, http://arxiv.org/abs/gr-qc/0409089. 
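The step leading to (17.15) can be made explicit. Acting with $\partial^\mu$ on each of the four terms on the left-hand side of (17.11), with indices lowered for convenience, gives the following; this term-by-term layout is a sketch added here for illustration.

```latex
\begin{align*}
\partial^\mu\left(\Box^2\bar h_{\mu\nu}\right)
  &= \Box^2\,\partial_\rho\bar h^\rho{}_\nu, \\
\partial^\mu\left(\eta_{\mu\nu}\,\partial_\rho\partial_\sigma\bar h^{\rho\sigma}\right)
  &= \partial_\nu\partial_\rho\partial_\sigma\bar h^{\rho\sigma}, \\
\partial^\mu\left(-\partial_\nu\partial_\rho\bar h^\rho{}_\mu\right)
  &= -\partial_\nu\partial_\rho\partial_\sigma\bar h^{\rho\sigma}, \\
\partial^\mu\left(-\partial_\mu\partial_\rho\bar h^\rho{}_\nu\right)
  &= -\Box^2\,\partial_\rho\bar h^\rho{}_\nu .
\end{align*}
```

The first and fourth lines cancel, as do the second and third, so the divergence of the left-hand side vanishes identically for any $\bar h_{\mu\nu}$. The right-hand side must then satisfy $-2\kappa\,\partial^\mu T_{\mu\nu} = 0$, which is (17.15).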

An alternative way to uncover this inconsistency is to note that an immediate 
consequence of having linearised the field equations is that solutions can be added. 
In other words, if the pairs of tensors $(\bar h_{\mu\nu})_i$ and $(T_{\mu\nu})_i$ individually satisfy (17.11) 
for $i = 1, 2, \ldots$ then the quantity $\sum_i(\bar h_{\mu\nu})_i$ is also a solution, corresponding to 
the energy-momentum tensor $\sum_i(T_{\mu\nu})_i$. Thus, for example, two point masses 
could remain at a fixed separation from one another indefinitely, the resulting 
gravitational field being simply the superposition of their individual radial fields. 

Despite this inconsistency, linearised general relativity is still a useful approximation, 
provided that we are interested only in the far field of sources whose 
motion we know a priori and that we are willing to neglect the ‘gravity of gravity’. 
In such cases, the effect of weak gravitational fields on test particles can 
be computed by inserting the form (17.6) for the connection coefficients into the 
geodesic equations (17.16). To calculate how the sources themselves move under 
their own gravity, however, one would need to re-insert into the field equations 
the non-linear terms that the linear theory discards. 


17.5 Solution of the linearised field equations in vacuo 


In empty space, the linearised field equations in the Lorenz gauge reduce to the 
wave equation 

$$\Box^2\bar h^{\mu\nu} = 0,$$ (17.17) 

with the attendant gauge condition 

$$\partial_\mu\bar h^{\mu\nu} = 0.$$ (17.18) 

It is straightforward to show that the field equations have plane-wave solutions 
of the form 

$$\bar h^{\mu\nu} = A^{\mu\nu}\exp(ik_\rho x^\rho),$$ (17.19) 

where the $A^{\mu\nu}$ are constant (and, in general, complex) components of a symmetric 
tensor, and the $k_\mu$ are the constant (real) components of a vector. Substituting 
the expression (17.19) into the wave equation (17.17) and using the fact that 
$\partial_\rho\bar h^{\mu\nu} = ik_\rho\bar h^{\mu\nu}$, we find that 

$$\Box^2\bar h^{\mu\nu} = \eta^{\rho\sigma}\partial_\rho\partial_\sigma\bar h^{\mu\nu} = -\eta^{\rho\sigma}k_\rho k_\sigma\bar h^{\mu\nu} = 0.$$

This can only be satisfied if 

$$\eta^{\rho\sigma}k_\rho k_\sigma \equiv k^\sigma k_\sigma = 0,$$ (17.20) 

and hence the vector $k$ must be null. Since the linearised Einstein equations only 
take the simple form (17.17) in the Lorenz gauge, we must also take into account 
the gauge condition (17.18). On substituting into the latter the plane-wave form 
(17.19), we immediately find that the gauge condition is satisfied provided that 
one obeys the additional constraint 

$$A^{\mu\nu}k_\nu = 0.$$ (17.21) 

Thus any plane wave of the form (17.19) is a valid solution of the linearised 
vacuum field equations in the Lorenz gauge, provided that the vector $k^\mu$ satisfies 
(17.20) and (17.21). We will discuss plane gravitational waves in detail in the 
next chapter. 
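The two algebraic conditions (17.20) and (17.21) are simple to test for a candidate wave. The Python sketch below checks them for a wave travelling along the $x^3$-direction whose amplitude has only transverse components, and for a second amplitude that violates the gauge condition. The specific amplitudes and helper names are illustrative assumptions, not results quoted from the text.

```python
ETA_DIAG = (1.0, -1.0, -1.0, -1.0)   # diagonal of the Minkowski metric


def lower(v_up):
    """Lower the index of a 4-vector using the diagonal Minkowski metric."""
    return [ETA_DIAG[m] * v_up[m] for m in range(4)]


def is_null(k_up, tol=1e-12):
    """Check the null condition k^mu k_mu = 0."""
    k_lo = lower(k_up)
    return abs(sum(k_up[m] * k_lo[m] for m in range(4))) <= tol


def satisfies_lorenz(A_up, k_up, tol=1e-12):
    """Check the gauge constraint A^{mu nu} k_nu = 0 for every mu."""
    k_lo = lower(k_up)
    return all(abs(sum(A_up[m][n] * k_lo[n] for n in range(4))) <= tol
               for m in range(4))


wavenumber = 2.0
k_up = [wavenumber, 0.0, 0.0, wavenumber]    # propagation along x^3

a, b = 1.0 + 0.5j, -0.3j                     # arbitrary complex amplitudes
A_transverse = [[0, 0, 0, 0],
                [0, a, b, 0],
                [0, b, -a, 0],
                [0, 0, 0, 0]]

# A symmetric amplitude with a time-space component that breaks (17.21).
A_bad = [[0, 1.0, 0, 0],
         [1.0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]

checks = {
    "k is null": is_null(k_up),
    "transverse amplitude obeys gauge": satisfies_lorenz(A_transverse, k_up),
    "bad amplitude obeys gauge": satisfies_lorenz(A_bad, k_up),
}
```

The transverse amplitude passes because its non-zero components multiply only $k_1 = k_2 = 0$.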

Since the vacuum field equations are linear (by design), any solution of them 
may be written as a superposition of such plane-wave solutions of the form 

$$\bar h^{\mu\nu}(x) = \int A^{\mu\nu}(\mathbf{k})\exp(ik_\rho x^\rho)\,d^3k,$$ (17.22) 

where $[k^\mu] = (k^0, \mathbf{k})$ and the integral is taken over all values of $\mathbf{k}$. Physical 
solutions are obtained by taking the real part of (17.22). 


17.6 General solution of the linearised field equations 

We now consider the general form of the solution to the linearised field equations 
in the presence of some non-zero energy-momentum tensor $T^{\mu\nu}$. In this case, 
the field equations take the form of an inhomogeneous wave equation for each 
component, 

$$\Box^2\bar h^{\mu\nu} = -2\kappa T^{\mu\nu},$$ (17.23) 

together with the attendant gauge condition $\partial_\mu\bar h^{\mu\nu} = 0$. The general solution to 
(17.23) is most easily obtained by using a Green’s function, in a similar manner 
to that employed for solving the analogous problem in electromagnetism. We will 
now outline this approach. 

One begins by considering the solution to the inhomogeneous wave equation 
when the source is a $\delta$-function, i.e. it is located at a definite event in spacetime. 
If this event has coordinates $y^\sigma$, one is therefore interested in solving an equation 
of the form 

$$\Box_x^2\,G(x^\sigma - y^\sigma) = \delta^{(4)}(x^\sigma - y^\sigma),$$ (17.24) 

where the subscript on $\Box_x^2$ makes explicit that the d’Alembertian operator is with 
respect to the coordinates $x^\sigma$ and $G(x^\sigma - y^\sigma)$ is the Green’s function for our 
problem, which in the absence of boundaries must be a function only of the 
difference $x^\sigma - y^\sigma$. Since the field equations (17.23) are linear, sources that are 
more general can be built up by adding further $\delta$-function sources located at 
different events. Thus, the general solution to the linearised field equations can 
be written⁵ 

$$\bar h^{\mu\nu}(x^\sigma) = \bar h^{\mu\nu}_{(0)}(x^\sigma) - 2\kappa\int G(x^\sigma - y^\sigma)\,T^{\mu\nu}(y^\sigma)\,d^4y,$$ (17.25) 

where, for completeness, we have made use of the freedom to add any solution 
$\bar h^{\mu\nu}_{(0)}(x^\sigma)$ of the homogeneous field equations (i.e. the in vacuo field equations). It 
may be verified immediately by direct substitution that (17.25) does indeed solve 
(17.23). For the discussions in this chapter, however, we will take $\bar h^{\mu\nu}_{(0)}(x^\sigma) = 0$ 
without loss of generality. 
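The direct substitution mentioned above can be written out in a few lines. In the sketch below the homogeneous part of the solution is labelled $\bar h^{\mu\nu}_{(0)}$ (a label assumed here), and spacetime indices on the arguments are suppressed for brevity.

```latex
\begin{align*}
\Box_x^2\,\bar h^{\mu\nu}(x)
 &= \Box_x^2\,\bar h^{\mu\nu}_{(0)}(x)
    - 2\kappa\int\left[\Box_x^2\,G(x-y)\right]T^{\mu\nu}(y)\,d^4y \\
 &= 0 - 2\kappa\int\delta^{(4)}(x-y)\,T^{\mu\nu}(y)\,d^4y \\
 &= -2\kappa\,T^{\mu\nu}(x).
\end{align*}
```

A similar manipulation suggests why the gauge condition is respected when $\bar h^{\mu\nu}_{(0)} = 0$. Assuming that boundary terms vanish, one may move the derivative from $G$ onto the source and use (17.15):

```latex
\begin{align*}
\partial_\mu\bar h^{\mu\nu}(x)
 &= -2\kappa\int\frac{\partial G(x-y)}{\partial x^\mu}\,T^{\mu\nu}(y)\,d^4y
  = 2\kappa\int\frac{\partial G(x-y)}{\partial y^\mu}\,T^{\mu\nu}(y)\,d^4y \\
 &= -2\kappa\int G(x-y)\,\frac{\partial T^{\mu\nu}(y)}{\partial y^\mu}\,d^4y
  = 0 .
\end{align*}
```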

The problem of obtaining a general solution of the linearised field equations has 
thus been reduced to solving (17.24) to obtain the appropriate Green’s function. 
This may be achieved in a number of ways, and here we shall take a physically 
motivated approach. For convenience, we begin by placing the $\delta$-function source 
at the origin of our coordinate system. We will also make the identifications 
$[x^\mu] = (ct, \mathbf{x})$ and $r = |\mathbf{x}|$. With the source at the origin, we may write (17.24) as 

$$\partial_\mu\partial^\mu G(x^\sigma) = \delta^{(4)}(x^\sigma).$$ (17.26) 

We first integrate this equation over a four-dimensional hypervolume $V$. Since 
the spatial spherical symmetry of the problem suggests that the Green’s function 
should only depend on $ct$ and $r$, we choose the hypervolume to be a sphere of 
radius $r$ in its spatial dimensions and we integrate in $t$ from $-\infty$ to $\infty$. The 
geometry of the bounding surface $S$ of the hypervolume is illustrated by the 
vertical cylinder in Figure 17.1, in which the third spatial dimension $x^3$ has been 
suppressed. Performing the integration of (17.26) over $V$ we obtain 

$$\int_V\partial_\mu\partial^\mu G(x^\sigma)\,d^4x = \int_S[\partial_\mu G(x^\sigma)]\,n^\mu\,dS = 1,$$ (17.27) 

where in the first equality we have used the divergence theorem to rewrite the 
volume integral as an integral over the bounding surface $S$ with unit normal $n^\mu$. 
Since we are working with a metric of signature $(+,-,-,-)$, it should be noted 
that $n^\mu$ is chosen to be outward pointing if it is timelike and inward pointing if 
spacelike. 

5 Note that there is no need to include $\sqrt{-g}$ factors in our integral or delta-function definition, since we are
considering the problem simply as a tensor field $\bar{h}^{\mu\nu}(x)$ defined on a Minkowski spacetime background in a
Cartesian coordinate system.

Figure 17.1 The geometry of the surface S in spacetime used to evaluate the
Green's function for the wave equation. The lightcone L emanating from the
origin is also shown. The $x^3$-direction has been suppressed.

Let us now consider the contributions to this surface integral over S. Since
gravitational field variations travel at speed c, the only points in spacetime that
can be influenced by a $\delta$-function source at the origin are those lying on the
future-pointing part of the lightcone L. Thus $G(x^\sigma)$ must be zero at all points in
spacetime except those lying on the future lightcone, and so must be of the form

\[ G(x^\sigma) = \begin{cases} f(r)\,\delta(ct - r) & \text{for } ct > 0, \\ 0 & \text{for } ct < 0, \end{cases} \tag{17.28} \]


where f is an arbitrary function of r. The intersection of the future lightcone
with the surface S is a sphere (corresponding to a circle in Figure 17.1) of radius
r lying in the spatial hypersurface ct = r. Thus, the only contribution to the
surface integral in (17.27) is from this sphere, for which
the (spacelike) unit normal $n^\mu$ points in the inward spatial radial direction (as
illustrated). Rewriting the surface integral using $dS = r^2\,c\,dt\,d\Omega$ (where $d\Omega$ is an
element of solid angle) and $n^\mu\partial_\mu = -\partial_r$, and performing the integral over the
spatial sphere, we thus have

\[ -4\pi r^2 \int_{-\infty}^{\infty} \frac{\partial G(x^\sigma)}{\partial r}\,c\,dt = 1, \tag{17.29} \]


where the only contribution to the integral over t occurs at ct = r. Substituting
(17.28) into (17.29), we find that

\[ 4\pi r^2 f(r) \int_0^\infty \delta'(ct - r)\,c\,dt - 4\pi r^2 \frac{df}{dr} \int_0^\infty \delta(ct - r)\,c\,dt = 1, \tag{17.30} \]

where the prime on the $\delta$-function denotes differentiation with respect to its
argument. Integration by parts quickly shows the first integral on the left-hand
side of (17.30) to be zero, whereas the second integral equals unity. We therefore
require $-4\pi r^2\,df/dr = 1$ and so $f(r) = 1/(4\pi r)$, where the constant of integration
vanishes since the Green's function must tend to zero at spatial infinity. Thus,
re-expressing the result in terms of the coordinates $x^\sigma$, the required Green's
function is

\[ G(x^\sigma) = \frac{\delta(x^0 - |\vec{x}|)\,\theta(x^0)}{4\pi|\vec{x}|}, \]

where the Heaviside function $\theta(x^0)$ equals unity if $x^0 > 0$ and zero if $x^0 < 0$.
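As a quick numerical sanity check of the radial profile, the sketch below evaluates $-4\pi r^2\,df/dr$ for $f(r) = 1/(4\pi r)$ by a central finite difference and confirms that it is close to unity at several radii. The function names and the step size are illustrative choices of ours, not part of the derivation.

```python
import math


def f(r):
    """Radial profile of the Green's function, f(r) = 1 / (4 pi r)."""
    return 1.0 / (4.0 * math.pi * r)


def flux_condition(r, rel_step=1e-6):
    """Return -4 pi r^2 df/dr, with df/dr from a central finite difference.

    rel_step is an assumed relative step size chosen for illustration.
    """
    h = rel_step * r
    dfdr = (f(r + h) - f(r - h)) / (2.0 * h)
    return -4.0 * math.pi * r**2 * dfdr


if __name__ == "__main__":
    for radius in (0.5, 1.0, 10.0):
        # Each value should be very close to 1, independent of radius.
        print(radius, flux_condition(radius))
```

The result is independent of r, which is the statement that the total flux through any sphere surrounding the source is the same.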

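We may now use this form to substitute for $G(x^\sigma - y^\sigma)$ in (17.25), with the homogeneous solution
set to zero, to obtain

\[ \bar{h}^{\mu\nu}(x^\sigma) = -\frac{\kappa}{2\pi} \int \frac{\delta(x^0 - y^0 - |\vec{x} - \vec{y}|)\,\theta(x^0 - y^0)}{|\vec{x} - \vec{y}|}\, T^{\mu\nu}(y^\sigma)\, d^4y, \]

where $\kappa = 8\pi G/c^4$. Using the delta function to perform the integral over $y^0$, we finally find that the
general solution to the linearised field equations (17.23) is given by

\[ \bar{h}^{\mu\nu}(ct, \vec{x}) = -\frac{4G}{c^4} \int \frac{T^{\mu\nu}(ct - |\vec{x} - \vec{y}|, \vec{y})}{|\vec{x} - \vec{y}|}\, d^3y. \tag{17.31} \]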

The interpretation of (17.31) requires some words of explanation. Here $\vec{x}$ represents
the spatial coordinates of the field point at which $\bar{h}^{\mu\nu}$ is determined, $\vec{y}$
represents the spatial coordinates of a point in the source and $|\vec{x} - \vec{y}|$ is the spatial
distance between them. We see that the disturbance in the gravitational field at the
event $(ct, \vec{x})$ is the integral over the region of spacetime occupied by the points
of the source at the retarded times $t_r$ given by

\[ ct_r = ct - |\vec{x} - \vec{y}|. \]



This region is the intersection of the past lightcone of the field point with the 
world tube of the source. An illustration of the geometric meaning of the retarded 
time is shown in Figure 17.2. 
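To make the retarded-time relation $ct_r = ct - |\vec{x} - \vec{y}|$ concrete, the following minimal sketch evaluates $t_r$ for a given field point and source point. The function name, the SI units and the Sun-Earth example are our own illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def retarded_time(t, field_point, source_point, c=C):
    """Return t_r = t - |x - y| / c.

    field_point and source_point are 3-tuples of Cartesian coordinates
    in metres; t is the coordinate time at the field point in seconds.
    """
    distance = math.dist(field_point, source_point)
    return t - distance / c


if __name__ == "__main__":
    one_au = 1.495978707e11  # metres
    # A change in a source at the Sun's position is felt about 499 s later.
    print(retarded_time(0.0, (one_au, 0.0, 0.0), (0.0, 0.0, 0.0)))
```

Each point of an extended source has its own retarded time, which is why the integrand in (17.31) samples the source on the past lightcone of the field point rather than on a surface of constant t.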

Although we have shown that (17.31) satisfies the linearised field equations
(17.23), this form of the field equations is only valid in the Lorenz gauge. We must
therefore verify that (17.31) also satisfies the Lorenz gauge condition $\partial_\mu \bar{h}^{\mu\nu} = 0$.
Before embarking on this we first remind ourselves how to differentiate a function
of retarded time. Setting $x_r^0 \equiv ct_r = x^0 - |\vec{x} - \vec{y}|$, for any function f we have

\[ \frac{\partial f(x_r^0, \vec{y})}{\partial x^\mu} = \left[\frac{\partial f(y^0, \vec{y})}{\partial y^0}\right]_r \frac{\partial x_r^0}{\partial x^\mu}, \tag{17.32} \]

\[ \frac{\partial f(x_r^0, \vec{y})}{\partial y^i} = \left[\frac{\partial f(y^0, \vec{y})}{\partial y^i}\right]_r + \left[\frac{\partial f(y^0, \vec{y})}{\partial y^0}\right]_r \frac{\partial x_r^0}{\partial y^i}, \tag{17.33} \]

where $[\;]_r$ denotes that the expression contained within the brackets is evaluated
at $y^0 = x_r^0$ and where i = 1, 2, 3. In addition to (17.33), we note that
$\partial f(x_r^0, \vec{y})/\partial y^0 = 0$.

Figure 17.2 The disturbance in the gravitational field at the event $(ct, x^i)$ is the
sum of the influences of the energy and momentum sources at the points $(ct_r, y^i)$
on the past lightcone.

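Let us now verify that the solution (17.31) does indeed satisfy the Lorenz gauge
condition. Differentiating, we obtain

\[ \begin{aligned}
\frac{\partial \bar{h}^{\mu\nu}(x^0, \vec{x})}{\partial x^\mu}
&= -\frac{4G}{c^4} \int \left[ \frac{1}{|\vec{x} - \vec{y}|} \frac{\partial T^{0\nu}(x_r^0, \vec{y})}{\partial x^0} + \frac{\partial}{\partial x^i} \left( \frac{T^{i\nu}(x_r^0, \vec{y})}{|\vec{x} - \vec{y}|} \right) \right] d^3y \\
&= -\frac{4G}{c^4} \int \left[ \frac{1}{|\vec{x} - \vec{y}|} \frac{\partial T^{\mu\nu}(x_r^0, \vec{y})}{\partial x^\mu} + T^{i\nu}(x_r^0, \vec{y}) \frac{\partial}{\partial x^i} \left( \frac{1}{|\vec{x} - \vec{y}|} \right) \right] d^3y,
\end{aligned} \tag{17.34} \]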


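where we show explicitly that the partial derivatives are with respect to the
coordinates $x^\mu$. Using (17.32), the derivative in the first term of the integrand can
be rewritten as follows:

\[ \frac{\partial T^{\mu\nu}(x_r^0, \vec{y})}{\partial x^\mu} = \left[\frac{\partial T^{\mu\nu}(y^0, \vec{y})}{\partial y^0}\right]_r \frac{\partial x_r^0}{\partial x^\mu} = \left[\frac{\partial T^{0\nu}(y^0, \vec{y})}{\partial y^0}\right]_r - \left[\frac{\partial T^{i\nu}(y^0, \vec{y})}{\partial y^0}\right]_r \frac{\partial x_r^0}{\partial y^i}, \]

where in the second equality we have used the facts that $\partial x_r^0/\partial x^0 = 1$ and $\partial x_r^0/\partial x^i = -\partial x_r^0/\partial y^i$.
Returning to (17.34), in a similar manner we may replace $\partial/\partial x^i$ by $-\partial/\partial y^i$ in
the second term of the integrand, which then allows this term to be integrated by
parts, since

\[ \int T^{i\nu}(x_r^0, \vec{y}) \frac{\partial}{\partial y^i} \left( \frac{1}{|\vec{x} - \vec{y}|} \right) d^3y = \oint_S \frac{T^{i\nu}(x_r^0, \vec{y})}{|\vec{x} - \vec{y}|}\, n_i\, dS - \int \frac{1}{|\vec{x} - \vec{y}|} \frac{\partial T^{i\nu}(x_r^0, \vec{y})}{\partial y^i}\, d^3y, \]

where S is the surface of the region of intersection between the past lightcone of
the field point and the world tube of the source and $n_i$ is the outward-pointing
normal to the surface. Moreover, since $T^{i\nu}(x_r^0, \vec{y})$ vanishes on S, the surface
integral is zero.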

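Combining our results, we may therefore write (17.34) as

\[ \frac{\partial \bar{h}^{\mu\nu}}{\partial x^\mu} = -\frac{4G}{c^4} \int \left\{ \left[\frac{\partial T^{0\nu}(y^0, \vec{y})}{\partial y^0}\right]_r - \left[\frac{\partial T^{i\nu}(y^0, \vec{y})}{\partial y^0}\right]_r \frac{\partial x_r^0}{\partial y^i} + \frac{\partial T^{i\nu}(x_r^0, \vec{y})}{\partial y^i} \right\} \frac{d^3y}{|\vec{x} - \vec{y}|}. \]

Making use of the result (17.33) to combine the last two terms within the braces,
we thus arrive at the final form

\[ \frac{\partial \bar{h}^{\mu\nu}}{\partial x^\mu} = -\frac{4G}{c^4} \int \frac{1}{|\vec{x} - \vec{y}|} \left[\frac{\partial T^{\mu\nu}(y^0, \vec{y})}{\partial y^\mu}\right]_r d^3y. \tag{17.35} \]

As shown in Section 17.4, however, in the linearised theory the energy-momentum
tensor obeys $\partial_\mu T^{\mu\nu} = 0$. Thus the integrand in (17.35) vanishes, and so we have
verified that the solution (17.31) satisfies the Lorenz gauge condition $\partial_\mu \bar{h}^{\mu\nu} = 0$.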


17.7 Multipole expansion of the general solution 

In general, the source of the gravitational field may be dynamic and have a spatial 
extent that is not small compared with the distance to the point at which one 
wishes to calculate the field. In such cases, obtaining a simple expression for the 
solution (17.31) is often analytically intractable. In an analogous manner to that 
used in electromagnetism, it is often convenient to perform a multipole expansion 
of (17.31), which lends itself to the calculation of successive approximations to 
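the solution. One begins by writing down the Taylor expansion

\[ \frac{1}{|\vec{x} - \vec{y}|} = \frac{1}{r} + (-y^i)\,\partial_i \left(\frac{1}{r}\right) + \frac{1}{2!}(-y^i)(-y^j)\,\partial_i \partial_j \left(\frac{1}{r}\right) + \cdots
= \frac{1}{r} + y^i \frac{x_i}{r^3} + y^i y^j\, \frac{3x_i x_j - r^2\delta_{ij}}{2r^5} + \cdots, \]

where $r = |\vec{x}|$ is the spatial distance from the origin to the field point and $\partial_i \equiv
\partial/\partial x^i$. One may then write the solution (17.31) as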


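\[ \bar{h}^{\mu\nu}(ct, \vec{x}) = -\frac{4G}{c^4} \left[ \frac{1}{r} \int T^{\mu\nu}(ct_r, \vec{y})\, d^3y + \frac{x_i}{r^3} \int T^{\mu\nu}(ct_r, \vec{y})\, y^i\, d^3y
+ \frac{3x_i x_j - r^2\delta_{ij}}{2r^5} \int T^{\mu\nu}(ct_r, \vec{y})\, y^i y^j\, d^3y + \cdots \right], \tag{17.36} \]

where $ct_r = ct - |\vec{x} - \vec{y}|$. This multipole expansion may be written in a particularly
compact form:

\[ \bar{h}^{\mu\nu}(ct, \vec{x}) = -\frac{4G}{c^4} \sum_{\ell=0}^{\infty} \frac{(-1)^\ell}{\ell!}\, M^{\mu\nu i_1 i_2 \cdots i_\ell}(ct_r)\, \partial_{i_1} \partial_{i_2} \cdots \partial_{i_\ell} \left(\frac{1}{r}\right), \]

where the multipole moments of the source distribution at any time t are given by

\[ M^{\mu\nu i_1 i_2 \cdots i_\ell}(ct) = \int T^{\mu\nu}(ct, \vec{y})\, y^{i_1} y^{i_2} \cdots y^{i_\ell}\, d^3y. \]

Since the fall-off with distance of the term associated with the $\ell$th multipole
moment goes as $1/r^{\ell+1}$, the gravitational field at large distances from the source
is well approximated by only the first few terms of the multipole expansion.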
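The convergence of the expansion of $1/|\vec{x} - \vec{y}|$ can be checked numerically. The sketch below compares the exact inverse distance with the expansion truncated at the monopole, dipole and quadrupole terms, using the contracted forms $y^i x_i = \vec{x}\cdot\vec{y}$ and $y^i y^j (3x_i x_j - r^2\delta_{ij}) = 3(\vec{x}\cdot\vec{y})^2 - r^2|\vec{y}|^2$ with Euclidean index conventions. The function names and sample points are illustrative assumptions.

```python
import math


def exact_inverse_distance(x, y):
    """Return 1 / |x - y| for 3-tuples x (field point) and y (source point)."""
    return 1.0 / math.dist(x, y)


def multipole_inverse_distance(x, y, order=2):
    """Taylor expansion of 1 / |x - y| about y = 0, truncated at `order`.

    order = 0: monopole; 1: adds the dipole term; 2: adds the quadrupole term.
    """
    r = math.sqrt(sum(xi * xi for xi in x))
    x_dot_y = sum(xi * yi for xi, yi in zip(x, y))
    y_squared = sum(yi * yi for yi in y)

    total = 1.0 / r
    if order >= 1:
        total += x_dot_y / r**3
    if order >= 2:
        total += (3.0 * x_dot_y**2 - r**2 * y_squared) / (2.0 * r**5)
    return total


if __name__ == "__main__":
    field_point = (10.0, 0.0, 0.0)
    source_point = (0.1, 0.2, 0.0)
    exact = exact_inverse_distance(field_point, source_point)
    for order in (0, 1, 2):
        approx = multipole_inverse_distance(field_point, source_point, order)
        print(order, approx, abs(approx - exact))
```

The truncation error shrinks by roughly a factor of $|\vec{y}|/r$ with each additional term, which is why only the first few multipoles matter far from the source.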


17.8 The compact-source approximation 


Let us suppose that the source is some matter distribution localised near the origin 
O of our coordinate system. If we take our field point x to be a distance r from 
O that is large compared with the spatial extent of the source, we need consider 
only the first term in the multipole expansion (17.36). Moreover, we assume that 
the source particles have speeds that are sufficiently small compared with c for us
to take $ct_r = ct - r$ in the argument of the stress-energy tensor. Thus, the solution
in the compact-source approximation is given by

\[ \bar{h}^{\mu\nu}(ct, \vec{x}) = -\frac{4G}{c^4 r} \int T^{\mu\nu}(ct - r, \vec{y})\, d^3y. \tag{17.37} \]

In this approximation, we are thus considering only the far-field solution to the
linearised gravitational equations, which varies as 1/r.

From (17.37), we see that calculating the gravitational field has been reduced
to integrating $T^{\mu\nu}$ over the source at a fixed retarded time $ct - r$. The physical
interpretation of the various components of this integral is as follows:

$\int T^{00}\, d^3y$: total energy of source particles (including rest mass energy) $= Mc^2$;
$\int T^{0i}\, d^3y$: $c \times$ total momentum of source particles in the $x^i$-direction $= P^i c$;
$\int T^{ij}\, d^3y$: integrated internal stresses in the source.


For an isolated source, the quantities M and $P^i$ are constants in the linear
theory (this is easily proved directly from the conservation equation $\partial_\mu T^{\mu\nu} = 0$).⁶
Moreover, without loss of generality, we may take our spatial coordinates $x^i$ to
correspond to the 'centre-of-momentum' frame of the source particles, in which
case $P^i = 0$. Thus, from (17.37), in centre-of-momentum coordinates we have

\[ \bar{h}^{00} = -\frac{4GM}{c^2 r}, \qquad \bar{h}^{i0} = \bar{h}^{0i} = 0. \tag{17.38} \]

6 We shall see later that a source does in fact lose energy via the emission of gravitational radiation, but the
energy-momentum carried away by the gravitational field is quadratic in $h_{\mu\nu}$ and hence neglected in the
linear theory.

The remaining components of the gravitational field are then given by the integrated
stress within the source,

\[ \bar{h}^{ij}(ct, \vec{x}) = -\frac{4G}{c^4 r} \left[ \int T^{ij}(ct', \vec{y})\, d^3y \right]_r, \tag{17.39} \]

where $[\;]_r$ denotes that the expression in the brackets is evaluated at $ct' = ct - r$.

The integral in (17.39) is surprisingly troublesome to evaluate directly. Fortunately,
there exists an alternative route that leads to a very neat expression for this
quantity. We first recall that $\partial_\mu T^{\mu\nu} = 0$ (where, for consistency with (17.39), we
are considering $T^{\mu\nu}$ as a function of the coordinates $(ct', \vec{y})$ and so $\partial_0 = \partial/\partial(ct')$
and $\partial_k = \partial/\partial y^k$). From this result, we may write

\[ \partial_0 T^{00} + \partial_k T^{0k} = 0, \tag{17.40} \]

\[ \partial_0 T^{i0} + \partial_k T^{ik} = 0. \tag{17.41} \]

Let us now consider the integral

\[ \int \partial_k (T^{ik} y^j)\, d^3y = \int (\partial_k T^{ik})\, y^j\, d^3y + \int T^{ij}\, d^3y, \]

where the integral is taken over a region of space enclosing the source, so that
$T^{\mu\nu} = 0$ on the boundary surface S of the region. Using Gauss' theorem to convert
the integral on the left-hand side to an integral over the surface S, we find that
its value is zero. Hence, on using (17.41), we can write

\[ \int T^{ij}\, d^3y = -\int (\partial_k T^{ik})\, y^j\, d^3y = \int (\partial_0 T^{i0})\, y^j\, d^3y = \frac{1}{c} \frac{d}{dt'} \int T^{i0} y^j\, d^3y. \]

For later convenience, interchanging i and j and adding gives

\[ \int T^{ij}\, d^3y = \frac{1}{2c} \frac{d}{dt'} \int (T^{i0} y^j + T^{j0} y^i)\, d^3y. \tag{17.42} \]

We must now consider the integral

\[ \int \partial_k (T^{0k} y^i y^j)\, d^3y = \int (\partial_k T^{0k})\, y^i y^j\, d^3y + \int (T^{0i} y^j + T^{0j} y^i)\, d^3y, \]

where, once again, we may use Gauss' theorem to show that the left-hand side is
zero. Using (17.40), we thus have

\[ \int (T^{0i} y^j + T^{0j} y^i)\, d^3y = \frac{1}{c} \frac{d}{dt'} \int T^{00} y^i y^j\, d^3y. \tag{17.43} \]

Combining (17.42) and (17.43) yields

\[ \int T^{ij}\, d^3y = \frac{1}{2c^2} \frac{d^2}{dt'^2} \int T^{00} y^i y^j\, d^3y. \]

Inserting this expression into (17.39), we finally obtain the quadrupole formula

\[ \bar{h}^{ij}(ct, \vec{x}) = -\frac{2G}{c^6 r} \left[ \frac{d^2 I^{ij}(ct')}{dt'^2} \right]_r, \tag{17.44} \]

where we have defined the quadrupole-moment tensor of the energy density of
the source,

\[ I^{ij}(ct) = \int T^{00}(ct, \vec{y})\, y^i y^j\, d^3y, \tag{17.45} \]

which is a constant tensor on each hypersurface of constant time. In the next
chapter, we will use this formula to determine the far-field gravitational radiation
generated by a time-varying matter source.
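As an illustration of the quadrupole formula $\bar{h}^{ij} = -(2G/c^6 r)\,[d^2 I^{ij}/dt'^2]_r$, the sketch below treats a toy source that is our own assumption: two equal point masses m on a circular orbit of radius a and angular frequency $\omega$ in the xy-plane, with $T^{00} = \rho c^2$. Then $I^{xx} = 2mc^2 a^2 \cos^2\omega t$, and the code differentiates it numerically and compares the result with the analytic value $(8Gma^2\omega^2/c^4 r)\cos 2\omega t_r$. All names and parameter values are hypothetical.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8    # speed of light, m/s


def quadrupole_xx(t, m, a, omega):
    """I^{xx}(t) for two point masses at x = +/- a cos(omega t), T^{00} = rho c^2."""
    x1 = a * math.cos(omega * t)
    x2 = -x1
    return C**2 * m * (x1**2 + x2**2)


def second_derivative(func, t, dt):
    """Central finite-difference estimate of d^2 func / dt^2."""
    return (func(t + dt) - 2.0 * func(t) + func(t - dt)) / dt**2


def hbar_xx(t_ret, r, m, a, omega):
    """Component hbar^{xx} from the quadrupole formula at retarded time t_ret."""
    dt = 1e-4 * 2.0 * math.pi / omega  # small fraction of the orbital period
    i_ddot = second_derivative(lambda s: quadrupole_xx(s, m, a, omega), t_ret, dt)
    return -2.0 * G / (C**6 * r) * i_ddot


def hbar_xx_analytic(t_ret, r, m, a, omega):
    """Closed form for the same toy source."""
    return 8.0 * G * m * a**2 * omega**2 / (C**4 * r) * math.cos(2.0 * omega * t_ret)


if __name__ == "__main__":
    args = (1e20, 2.8e30, 1e7, 100.0)  # r [m], m [kg], a [m], omega [rad/s]
    print(hbar_xx(0.0, *args), hbar_xx_analytic(0.0, *args))
```

Note that the field oscillates at twice the orbital frequency, a direct consequence of the quadratic dependence of $I^{ij}$ on the source coordinates.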




17.9 Stationary sources 

Let us return to the general solution (17.31) to the linearised field equation. In the 
previous section we confined our attention to the far-field solution for a compact
source. This behaves like 1/r as a function of distance and depends only on 
the mass and inertia tensor of the source. As shown in the multipole expansion 
(17.36), other properties of the source generate a field that falls off more rapidly 
with distance. In general, it is often impossible to obtain a simple expression 
for the solution (17.31). Nevertheless, the solution simplifies somewhat when the 
source is stationary. 

A stationary source has $\partial_0 T^{\mu\nu} = 0$, i.e. the energy-momentum tensor is constant
in time. Note that this does not necessarily imply that the source is static (so that
its constituent particles are not moving), which would additionally require the
form of $T^{\mu\nu}$ to be invariant to the transformation $t \to -t$. A typical example of a
stationary, but non-static, source is a uniform rigid sphere rotating with constant
angular velocity. The main advantage of the stationary-source limit is that the
time dependence vanishes and thus retardation is irrelevant. Hence, the general
solution (17.31) to the linearised field equations reduces to

\[ \bar{h}^{\mu\nu}(\vec{x}) = -\frac{4G}{c^4} \int \frac{T^{\mu\nu}(\vec{y})}{|\vec{x} - \vec{y}|}\, d^3y. \tag{17.46} \]

One can perform a multipole expansion of this solution identical to that given
in (17.36) but for which all time dependence is omitted. Indeed, in this case, it
becomes somewhat simpler to interpret the various multipole moments physically.

A particularly interesting special case is the non-relativistic stationary source.
Consider a source having a well-defined spatial velocity field $u^i(\vec{x})$, where the
speed u of any constituent particle is small enough compared with c that we
can neglect terms of order $u^2/c^2$ and higher in its energy-momentum tensor. In
particular, we will take $\gamma_u = (1 - u^2/c^2)^{-1/2} \approx 1$. Moreover, the pressure p within
the source is everywhere much smaller than the energy density and may thus be
neglected. From the discussion of energy-momentum tensors in Section 8.1 we
see that, for such a source,

\[ T^{00} = \rho c^2, \qquad T^{0i} = c\rho u^i, \qquad T^{ij} = \rho u^i u^j, \]

where $\rho(\vec{x})$ is the proper-density distribution of the source. We see that
$|T^{ij}|/|T^{00}| \sim u^2/c^2$ and so we should take $T^{ij} \approx 0$ to the order of our approximation.
The corresponding solution (17.46) to the linearised field equations can
then be written as

\[ \bar{h}^{00} = \frac{4\Phi}{c^2}, \qquad \bar{h}^{0i} = \frac{A^i}{c}, \qquad \bar{h}^{ij} = 0, \tag{17.47} \]

where we have defined the gravitational scalar potential $\Phi$ and gravitational spatial
vector potential $A^i$ by

\[ \Phi(\vec{x}) \equiv -G \int \frac{\rho(\vec{y})}{|\vec{x} - \vec{y}|}\, d^3y, \tag{17.48} \]

\[ A^i(\vec{x}) \equiv -\frac{4G}{c^2} \int \frac{\rho(\vec{y})\, u^i(\vec{y})}{|\vec{x} - \vec{y}|}\, d^3y. \tag{17.49} \]

The corresponding components of $h^{\mu\nu}$ are given by $h^{\mu\nu} = \bar{h}^{\mu\nu} - \frac{1}{2}\eta^{\mu\nu}\bar{h}$. The
result (17.47) implies that $\bar{h} = \bar{h}^{00}$ and, on lowering indices, we find that the
non-zero components are

\[ h_{00} = h_{11} = h_{22} = h_{33} = \frac{2\Phi}{c^2}, \qquad h_{0i} = \frac{A_i}{c}. \tag{17.50} \]

It should be remembered that raising or lowering a spatial (roman) index introduces
a minus sign. Thus the numerical value of $A_i$ is minus that of $A^i$, the latter
being the ith component of the spatial vector $\vec{A}$. The obvious analogy between the
equations (17.48, 17.49) and their counterparts in the theory of electromagnetism
will be discussed in detail in Appendix 17A.

For the most part, in this chapter we adopt the viewpoint that $h_{\mu\nu}$ is simply a
rank-2 tensor field defined in Cartesian coordinates on a background Minkowski
spacetime. At this point, however, it is useful to revert to the viewpoint in which
$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ defines the metric of a (slightly) curved spacetime. From (17.50),
we may therefore write the line element, in the limit of the non-relativistic source
considered here and in quasi-Minkowski coordinates, as

\[ ds^2 = \left(1 + \frac{2\Phi}{c^2}\right) c^2 dt^2 + \frac{2A_i}{c}\, c\, dt\, dx^i - \left(1 - \frac{2\Phi}{c^2}\right) \sum_i (dx^i)^2, \tag{17.51} \]

in which it is worth noting that $A_i\, dx^i = -\delta_{ij} A^i\, dx^j = -\vec{A} \cdot d\vec{x}$. Determining the
geodesics of this line element provides a straightforward means of calculating the
trajectories of test particles in the gravitational field of a non-relativistic source
(in the weak-field limit). In particular, we note that we need not assume that the
test particles are slow-moving, and so the trajectories of photons in this limit may
also be found by determining the null geodesics of the line element (17.51).


17.10 Static sources and the Newtonian limit 


A special case of stationary sources are static sources, for which the constituent
particles are not moving. In this case the only non-zero component of the source
energy-momentum tensor is the rest energy $T^{00} = \rho c^2$, where $\rho(\vec{x})$ is the proper
density distribution of the source. Indeed, this Newtonian source limit is clearly
equivalent to a stationary source with a vanishing velocity field $u^i(\vec{x}) = 0$. Thus,
from (17.50), we immediately find that in this case the non-zero elements of
$h_{\mu\nu}$ are

\[ h_{00} = h_{11} = h_{22} = h_{33} = \frac{2\Phi}{c^2}. \tag{17.52} \]

In fact, the above solution remains valid to a good approximation even if the
source particles are moving, provided that the source energy-momentum tensor is
still dominated by the rest energy of the matter distribution, so that $|T^{00}| \gg |T^{0i}|$
and $|T^{00}| \gg |T^{ij}|$.

The line element corresponding to (17.52) is given by

\[ ds^2 = \left(1 + \frac{2\Phi}{c^2}\right) c^2 dt^2 - \left(1 - \frac{2\Phi}{c^2}\right) d\sigma^2, \tag{17.53} \]

where $d\sigma^2 = dx^2 + dy^2 + dz^2$; (17.53) is often referred to as the line element in
the Newtonian limit. Moreover, this line element is easily adapted to allow for
arbitrary spatial coordinate transformations, since $d\sigma^2$ is simply the line element
of three-dimensional Euclidean space. Thus if, for example, we adopt spatial
spherical polar coordinates then one need only rewrite the spatial line element as
$d\sigma^2 = dr^2 + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\phi^2$.

It is interesting to compare (17.53) with our discussion of the Newtonian limit
in Chapter 7, where we considered weak gravitational fields, static sources and
slowly moving test particles. Under these assumptions, we found that we recovered
the Newtonian equation of motion for a test particle provided that we made the
identification $h_{00} = 2\Phi/c^2$, where $\Phi$ is the Newtonian gravitational potential. In
the solution (17.53), we have arrived at the Newtonian limit without making any
restriction on the velocity of the test particle. This generalisation is important,
as previously we needed to consider only the effects of the $g_{00}$-component of
the metric, but, as the above solution shows, the trajectories of relativistic test
particles and photons also depend on the metric spatial components.

As an example of the line element (17.53), let us consider the simple case of
a static spherical object of mass M, so that the Newtonian gravitational potential
is given by $\Phi = -GM/r$, where r is a radial coordinate. In this case, adopting
spherical polar spatial coordinates, the line element in the Newtonian limit is
given by

\[ ds^2 = c^2 \left(1 - \frac{2GM}{c^2 r}\right) dt^2 - \left(1 + \frac{2GM}{c^2 r}\right) (dr^2 + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\phi^2), \]

which is straightforwardly shown to be identical to the Schwarzschild solution,
to first order in M. In the Solar System, this approximation is sufficiently accurate
to determine correctly the bending of light and radar echo delays (the
Shapiro effect) induced by the Sun, giving identical results to those discussed in
Chapter 10. The accuracy of the above approximation is, however, insufficient to
predict perihelion shifts correctly. This is not surprising, since perihelion shift is
a cumulative effect.
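The agreement with the Schwarzschild solution to first order in M can be checked numerically. The sketch below assumes the standard isotropic-coordinate form of the Schwarzschild metric (not quoted in this section), $ds^2 = [(1 - \mu/2r)/(1 + \mu/2r)]^2 c^2 dt^2 - (1 + \mu/2r)^4\, d\sigma^2$ with $\mu = GM/c^2$, and compares its coefficients with those of the Newtonian-limit line element at the solar surface. Function names and the solar values are illustrative.

```python
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8    # speed of light, m/s


def linearised_coefficients(mass, r):
    """Return (time coefficient, spatial coefficient) of the Newtonian-limit metric.

    The line element is ds^2 = time * c^2 dt^2 - spatial * dsigma^2.
    """
    x = G * mass / (C**2 * r)
    return 1.0 - 2.0 * x, 1.0 + 2.0 * x


def isotropic_schwarzschild_coefficients(mass, r):
    """Same two coefficients for the Schwarzschild metric in isotropic coordinates."""
    x = G * mass / (C**2 * r)
    time_coeff = ((1.0 - 0.5 * x) / (1.0 + 0.5 * x)) ** 2
    spatial_coeff = (1.0 + 0.5 * x) ** 4
    return time_coeff, spatial_coeff


if __name__ == "__main__":
    m_sun, r_sun = 1.989e30, 6.957e8  # kg, m
    lin = linearised_coefficients(m_sun, r_sun)
    iso = isotropic_schwarzschild_coefficients(m_sun, r_sun)
    # The differences are second order in GM / (c^2 r), i.e. tiny for the Sun.
    print(iso[0] - lin[0], iso[1] - lin[1])
```

The discrepancies are of order $(GM/c^2 r)^2$, which is negligible for light bending and time delays but matters for effects, such as perihelion shift, that are sensitive to the second-order terms.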


17.11 The energy-momentum of the gravitational field 

Physically, one would expect the gravitational field to carry energy-momentum 
just as, for example, the electromagnetic field does. Unfortunately, the task of 
assigning an energy density to a gravitational field is famously difficult, both 
technically and in principle. From our discussion of the equivalence principle in 
Chapter 7, we know that transforming coordinates to a freely falling frame can 
always eliminate gravitational effects at any one event. As a result, there is no local 
notion of gravitational energy density in general relativity. Moreover, in a general 
spacetime there is no reason why energy and momentum should be conserved. 
In electromagnetism, for example, the conservation of energy and momentum 
for the field is a direct consequence of the symmetries of the Minkowski spacetime
assumed in the theory. In a general spacetime, however, there are no such
symmetries. Even in the linearised gravitational theory developed in this chapter,
the field $h_{\mu\nu}$ represents a weak distortion of Minkowski space and so the Lorentz
symmetry properties are lost.

Nevertheless, as we have remarked several times, one can also regard the
linearised theory as describing a simple rank-2 tensor field $h_{\mu\nu}$ in Cartesian
inertial coordinates propagating in a fixed Minkowski spacetime background. We
might therefore hope to assign an energy-momentum tensor to this field just as
we do for electromagnetism, or any other field theory in Minkowski spacetime.
As was discussed in Section 17.4, the linearised gravitational theory ignores the
energy-momentum associated with the gravitational field itself (i.e. the 'gravity
of gravity'). To include this contribution, and thereby go beyond the linearised
theory, one must modify the linearised field equations to read

\[ G^{(1)}_{\mu\nu} = -\frac{8\pi G}{c^4} \left( T_{\mu\nu} + t_{\mu\nu} \right), \]

where $G^{(1)}_{\mu\nu}$ is the linearised Einstein tensor, $T_{\mu\nu}$ is the energy-momentum tensor
of any matter present and $t_{\mu\nu}$ is the energy-momentum tensor of the gravitational
field itself. Trivially rearranging this equation gives

\[ G^{(1)}_{\mu\nu} + \frac{8\pi G}{c^4}\, t_{\mu\nu} = -\frac{8\pi G}{c^4}\, T_{\mu\nu}. \]

Returning to the exact Einstein equations, however, we may expand beyond first
order to obtain

\[ G_{\mu\nu} \equiv G^{(1)}_{\mu\nu} + G^{(2)}_{\mu\nu} + \cdots = -\frac{8\pi G}{c^4}\, T_{\mu\nu}, \tag{17.54} \]

where superscripts in parentheses indicate the order of the expansion in $h_{\mu\nu}$. This
suggests that, to a good approximation, we should make the identification

\[ t_{\mu\nu} \equiv \frac{c^4}{8\pi G}\, G^{(2)}_{\mu\nu}. \tag{17.55} \]

This is also in keeping with our experience of other field theories in Minkowski 
spacetime, such as electromagnetism, in which the energy-momentum tensor is 
quadratic in the field variable. One should not, however, be too firmly guided by 
the analogy with electromagnetism. The reason why the electromagnetic energy- 
momentum tensor is quadratic in the field variable is that the electromagnetic 
field (constituted by photons in the quantum description) does not carry charge 
and so cannot act as its own source. Indeed, this is the physical reason why 
electromagnetism is a linear theory. In the gravitational case, however, one could 
in fact include the higher-order terms in (17.54) in the definition of $t_{\mu\nu}$; these
terms correspond to the contribution to the total energy-momentum arising from
the gravitational interaction of the gravitational field with itself. Nevertheless,
when the gravitational field is weak these higher-order terms may be neglected.

As one might expect from such an heuristic approach, however, there are some
shortcomings of the identification (17.55), which we now outline. The terms in
the Einstein tensor that are second-order in $h_{\mu\nu}$ are given by

\[ G^{(2)}_{\mu\nu} = R^{(2)}_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu} R^{(2)} - \tfrac{1}{2} h_{\mu\nu} R^{(1)} + \tfrac{1}{2}\eta_{\mu\nu} h^{\rho\sigma} R^{(1)}_{\rho\sigma}, \tag{17.56} \]

where $R^{(2)}_{\mu\nu}$ denotes the terms in the Ricci tensor that are second-order in $h_{\mu\nu}$ and
$R^{(1)}$ and $R^{(2)}$ denote the terms in the Ricci scalar that are first- and second-order
in $h_{\mu\nu}$ respectively. Although (17.56), and hence $t_{\mu\nu}$, is covariant under global
Lorentz transformations (although not under general coordinate transformations,
as one might expect), it may be shown, after considerable algebra, that
it is not invariant under the gauge transformation (17.5) (or equivalently the
infinitesimal coordinate transformation (17.4)). One way of circumventing this
problem is to take seriously the fact that the energy-momentum of a gravitational
field at a point in spacetime has no real meaning in general relativity, since at
any particular event one can always transform to a free-falling frame in which
gravitational effects disappear. This suggests that, at each point in spacetime, one
should average $G^{(2)}_{\mu\nu}$ over a small region in order to probe the physical curvature
of the spacetime, which gives a gauge-invariant measure of the gravitational
field strength. Denoting this averaging process by $\langle \cdots \rangle$, one should thus replace
(17.55) by

\[ t_{\mu\nu} \equiv \frac{c^4}{8\pi G} \left\langle G^{(2)}_{\mu\nu} \right\rangle. \tag{17.57} \]


Having made this identification, our task is now an algebraic one of determining
the form of $\langle G^{(2)}_{\mu\nu} \rangle$ as a function of $h_{\mu\nu}$. This is rather a cumbersome calculation,
but the job is made somewhat easier by averaging over small spacetime regions.
Since we are averaging over all directions at each point, first derivatives average
to zero. Thus, for any function of position a(x), we have $\langle \partial_\mu a \rangle = 0$. This has the
important consequence that $\langle \partial_\mu (ab) \rangle = \langle (\partial_\mu a) b \rangle + \langle a (\partial_\mu b) \rangle = 0$, and hence we
may swap derivatives in products and inherit only a minus sign, i.e.

\[ \langle (\partial_\mu a)\, b \rangle = -\langle a\, (\partial_\mu b) \rangle. \tag{17.58} \]
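The averaging rule $\langle (\partial_\mu a)\, b \rangle = -\langle a\, (\partial_\mu b) \rangle$ can be illustrated in one dimension. The sketch below takes two oscillating functions of our own choosing, $a = \sin kx$ and $b = \cos(kx + \phi)$, and averages over a whole number of wavelengths, which plays the role of the small-region average in the text. The names and parameter values are assumptions made for illustration only.

```python
import math

WAVELENGTH = 2.0
K = 2.0 * math.pi / WAVELENGTH
PHASE = 0.7
LENGTH = 5 * WAVELENGTH  # average over a whole number of wavelengths


def average(func, length, n=10_000):
    """Mean of func over [0, length), sampled at n equally spaced points."""
    return sum(func(length * i / n) for i in range(n)) / n


def a(x):
    return math.sin(K * x)


def da(x):
    return K * math.cos(K * x)


def b(x):
    return math.cos(K * x + PHASE)


def db(x):
    return -K * math.sin(K * x + PHASE)


mean_da = average(da, LENGTH)                        # <da/dx>, expected ~ 0
lhs = average(lambda x: da(x) * b(x), LENGTH)        # <(da/dx) b>
rhs = -average(lambda x: a(x) * db(x), LENGTH)       # -<a (db/dx)>

if __name__ == "__main__":
    print(mean_da, lhs, rhs)
```

Both sides equal $\tfrac{1}{2}k\cos\phi$ for this choice, while the average of the bare derivative vanishes, as the argument in the text requires.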

Let us begin by considering the last two terms on the right-hand side of (17.56),
which depend on the first-order Ricci tensor and Ricci scalar. It will prove most
convenient to express these in terms of the energy-momentum tensor $T_{\mu\nu}$ of any
matter present. The first-order (linearised) field equation (17.11) can be written as

\[ R^{(1)}_{\mu\nu} = -\kappa \left( T_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu} T \right), \]

where $T = T^\rho{}_\rho$ and $\kappa = 8\pi G/c^4$. We also note from this equation that $R^{(1)} = \kappa T$.
Thus, we may write (17.56) as

\[ G^{(2)}_{\mu\nu} = R^{(2)}_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu} R^{(2)} - \tfrac{1}{2}\kappa \left( h_{\mu\nu} T + \eta_{\mu\nu} h^{\rho\sigma} T_{\rho\sigma} - \tfrac{1}{2}\eta_{\mu\nu} h T \right). \tag{17.59} \]

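It therefore remains only to find the form of $R^{(2)}_{\mu\nu}$, from which $R^{(2)}$ may
be obtained by contraction. The standard expression for the full Ricci tensor
is obtained by contracting (17.7) on its first and last indices. Thus, the terms
second-order in $h_{\mu\nu}$ are given by

\[ R^{(2)}_{\mu\nu} = \partial_\nu \Gamma^{(2)\sigma}{}_{\mu\sigma} - \partial_\sigma \Gamma^{(2)\sigma}{}_{\mu\nu} + \Gamma^{(1)\rho}{}_{\mu\sigma} \Gamma^{(1)\sigma}{}_{\rho\nu} - \Gamma^{(1)\rho}{}_{\mu\nu} \Gamma^{(1)\sigma}{}_{\rho\sigma}, \tag{17.60} \]

where, on the right-hand side, the superscripts in parentheses denote the order of
expansion in $h_{\mu\nu}$ for the connection coefficients. The connection coefficients to
first order were calculated in (17.6), and now including the second-order terms
we have

\[ \begin{aligned}
\Gamma^\sigma{}_{\mu\nu} &= \Gamma^{(1)\sigma}{}_{\mu\nu} + \Gamma^{(2)\sigma}{}_{\mu\nu} + \cdots \\
&= \tfrac{1}{2} \left( \partial_\nu h^\sigma{}_\mu + \partial_\mu h^\sigma{}_\nu - \partial^\sigma h_{\mu\nu} \right) - \tfrac{1}{2} h^{\sigma\tau} \left( \partial_\nu h_{\tau\mu} + \partial_\mu h_{\tau\nu} - \partial_\tau h_{\mu\nu} \right) + \cdots.
\end{aligned} \]

Inserting these expressions into (17.60) and simplifying, one finds after a little
algebra

\[ \begin{aligned}
R^{(2)}_{\mu\nu} = {} & -\tfrac{1}{4} (\partial_\mu h_{\rho\sigma})\, \partial_\nu h^{\rho\sigma} + \tfrac{1}{2} h^{\rho\sigma} \left( \partial_\mu \partial_\sigma h_{\nu\rho} + \partial_\nu \partial_\sigma h_{\mu\rho} - \partial_\mu \partial_\nu h_{\rho\sigma} - \partial_\rho \partial_\sigma h_{\mu\nu} \right) \\
& + \tfrac{1}{2} (\partial^\sigma h^\rho{}_\nu) \left( \partial_\rho h_{\sigma\mu} - \partial_\sigma h_{\rho\mu} \right) + \tfrac{1}{2} \left( \partial_\sigma h^{\rho\sigma} - \tfrac{1}{2} \partial^\rho h \right) \left( \partial_\mu h_{\nu\rho} + \partial_\nu h_{\mu\rho} - \partial_\rho h_{\mu\nu} \right).
\end{aligned} \tag{17.61} \]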

Although the third group of terms on the right-hand side is not manifestly symmetric
in $\mu$ and $\nu$, this symmetry is easy to verify. In fact, in subsequent calculations
it is convenient to maintain manifest symmetry by writing out this term again
with $\mu$ and $\nu$ reversed and multiplying both terms by one-half.

To evaluate the averaged expression (17.57), we must now calculate $\langle R^{(2)}_{\mu\nu} \rangle$.
One first makes use of the result (17.58) to rewrite products of first derivatives in
(17.61) in terms of second derivatives. Using the first-order field equation (17.10)
to substitute for terms of the form $\Box^2 h_{\mu\nu}$, and then applying (17.58) once more
to rewrite terms containing second derivatives as products of first derivatives, one
finally obtains

(*$) = I ((<yvK /iP<7 - ^d a hn^K )p +2(d p h)d ilx h p v) - (d^h)d v h 

+K(2y + 2/ I r F - v / I r-4^r;)), (n. 62) 

where we have made use of the symmetrisation notation discussed in Chapter 4. 
Contracting this expression, and once again making use of the result (17.58) and 
the first-order field equation (17.10), one quickly finds that 

{RW) = -\K{hP°T p(T ). (17.63) 

Combining the expressions (17.59), (17.62) and (17.63) and writing the result
(mostly) in terms of the trace reverse field $\bar{h}_{\mu\nu} = h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu} h$, we thus find that
the energy-momentum tensor (17.57) of the gravitational field is given by


(17.64) 


It may be verified by direct substitution that this expression is indeed invariant 
under the gauge transformation (17.5), as required. We shall use this tensor in the 
next chapter to determine the energy carried by gravitational waves. 



Appendix 17A: The Einstein-Maxwell formulation of linearised gravity 

In our discussion of non-relativistic stationary sources in Section 17.9, we found 
that the expressions for the gravitational field exhibited a remarkable similarity to 
the corresponding results in electromagnetism. We now pursue further the analogy 
between linearised general relativity and electromagnetism for non-relativistic 
stationary sources. 

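As discussed in Section 17.9, for such a source we may write

\[ h^{00} = h^{11} = h^{22} = h^{33} = \frac{2\Phi_g}{c^2}, \qquad h^{0i} = \frac{A_g^i}{c}, \qquad h^{ij} = 0 \ \ (i \neq j); \tag{17.65} \]

here we denote the gravitational scalar and vector potentials by $\Phi_g$ and $\vec{A}_g$
respectively. The linearised field equations may then be written as

\[ \nabla^2 \Phi_g = 4\pi G \rho \qquad \text{and} \qquad \nabla^2 \vec{A}_g = \frac{16\pi G}{c^2}\, \vec{j}, \tag{17.66} \]

where we have defined the momentum density (or matter current density) $\vec{j} \equiv \rho\vec{v}$.
These equations have the solutions (17.48, 17.49), which we write as

\[ \Phi_g(\vec{x}) = -G \int \frac{\rho(\vec{y})}{|\vec{x} - \vec{y}|}\, d^3y \qquad \text{and} \qquad \vec{A}_g(\vec{x}) = -\frac{4G}{c^2} \int \frac{\vec{j}(\vec{y})}{|\vec{x} - \vec{y}|}\, d^3y. \]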

Comparing the above results with the corresponding equations in electromagnetism
for the electric potential and the magnetic vector potential in the absence
of time-varying fields, there is a direct analogy on making the identifications

\[ \epsilon_0 \leftrightarrow -\frac{1}{4\pi G} \qquad \text{and} \qquad \mu_0 \leftrightarrow -\frac{16\pi G}{c^2}. \]

The minus signs in these relations are a result of the fact that the electric force
repels like charges, whereas the gravitational force attracts (like) masses. Clearly,
in the electromagnetic case, $\rho$ and $\vec{j}$ correspond to the charge and current densities
respectively, rather than the matter and momentum densities. We can take the
analogy further by defining the gravitoelectric and gravitomagnetic fields

\[ \vec{E}_g = -\nabla \Phi_g \qquad \text{and} \qquad \vec{B}_g = \nabla \times \vec{A}_g. \tag{17.67} \]

Using the equations (17.66), it is straightforward to verify that the fields $\vec{E}_g$ and
$\vec{B}_g$ satisfy the gravitational Maxwell equations

\[ \nabla \cdot \vec{E}_g = -4\pi G \rho, \qquad \nabla \cdot \vec{B}_g = 0, \qquad \nabla \times \vec{E}_g = 0, \qquad \nabla \times \vec{B}_g = -\frac{16\pi G}{c^2}\, \vec{j}. \]

The equations for $\vec{E}_g$ describe the standard gravitational field produced by a static
mass distribution, whereas the equations for $\vec{B}_g$ provide a notationally familiar
means of determining the 'extra' gravitational field produced by moving masses
in a stationary non-relativistic source.

Although the gravitational Maxwell equations completely determine the gravitational
fields produced by a non-relativistic stationary source, they do not determine
the effect of such fields on the motion of a test particle. In electromagnetism
one must, in addition, postulate the Lorentz force law. From our discussion in
Section 8.8, however, one might suspect that in the case of gravitation the corresponding
force law could be derived rather than postulated. The equation of
motion for a test particle in a gravitational field is the geodesic equation

\[ \ddot{x}^\sigma + \Gamma^\sigma{}_{\mu\nu}\, \dot{x}^\mu \dot{x}^\nu = 0, \tag{17.68} \]

where the dots denote differentiation with respect to the proper time $\tau$ of the
particle. Let us assume that the test particle is slow-moving, i.e. its speed v
is sufficiently small compared with c that we may neglect terms in $v^2/c^2$ and
higher. Hence we may take $\gamma_v = (1 - v^2/c^2)^{-1/2} \approx 1$. Writing $[x^\mu] = (ct, \vec{x})$, the
4-velocity of the particle may thus be written

\[ [\dot{x}^\mu] = \gamma_v (c, \vec{v}) \approx (c, \vec{v}). \]

This immediately implies that $\ddot{x}^0 = 0$ and, moreover, that $dt/d\tau = 1$, so we may
replace dots with derivatives with respect to t. Thus, the spatial components of
(17.68) may be written as


d 2 x' 
dt 2 


(c 2 r' 00 + 2cT‘ 0j v J + FyuV) 


(c 2 r (M + 2cV 0j v J ), (17.69) 


where in the first approximate equality we have expanded the summation in
(17.68) into terms containing respectively two time components, one time and one
spatial component, and two spatial components. In the second approximation, we
have neglected the purely spatial terms since their ratio with respect to the purely
temporal term $c^2\Gamma^i{}_{00}$ is of order $v^2/c^2$. To first order in the gravitational field
$h_{\mu\nu}$, the connection coefficients are given by (17.6). Inserting this expression into
(17.69) and remembering that for a stationary field $\partial_0h_{\mu\nu} = 0$, one obtains

$$\frac{d^2x^i}{dt^2} \approx \tfrac12c^2\,\partial^ih_{00} + c\left(\partial^ih_{0j} - \partial_jh_0{}^i\right)v^j = -\tfrac12c^2\,\delta^{ij}\partial_jh_{00} - c\,\delta^{ik}\left(\partial_kh_{0j} - \partial_jh_{0k}\right)v^j.$$


Substituting the expressions (17.65) and remembering that one inherits a minus
sign on raising or lowering a spatial (roman) index, the equation of motion may be
written as

$$\frac{d^2\vec{x}}{dt^2} \approx -\nabla\Phi_g + \vec{v}\times(\nabla\times\vec{A}_g).$$


Thus, using (17.67), one obtains the gravitational Lorentz force law

$$\frac{d^2\vec{x}}{dt^2} \approx \vec{E}_g + \vec{v}\times\vec{B}_g,$$


for slow-moving particles in the gravitational field of a stationary non-relativistic 
source. The first term on the right-hand side gives the standard Newtonian result 
for the motion of a test particle in the field of a static non-relativistic source, 
whereas the second term gives a notationally familiar result for the ‘extra’ force
felt by a moving test particle in the presence of the ‘extra’ field produced by 
moving masses in a stationary non-relativistic source. 
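
As a numerical illustration of the force law just derived, the following minimal sketch evaluates the acceleration $\vec{E}_g + \vec{v}\times\vec{B}_g$ of a slow-moving test particle. The function names and the field values are hypothetical choices made only for this example; they are not taken from the text.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def gravitational_lorentz_acceleration(e_g, b_g, v):
    """Return d^2x/dt^2 = E_g + v x B_g for a slow-moving test particle."""
    v_cross_b = cross(v, b_g)
    return tuple(e + f for e, f in zip(e_g, v_cross_b))


# Illustrative numbers only (an assumption of this sketch): E_g along -z,
# B_g along +z and a particle moving along +x.
acceleration = gravitational_lorentz_acceleration(
    e_g=(0.0, 0.0, -2.0), b_g=(0.0, 0.0, 3.0), v=(1.0, 0.0, 0.0)
)
print(acceleration)
```

The velocity-dependent term is perpendicular to both $\vec{v}$ and $\vec{B}_g$, exactly as for the magnetic force in electromagnetism.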
Exercises

17.1 In a region of spacetime with a weak gravitational field, there exist coordinates in
which the metric takes the form $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$. Show that $h_{\mu\nu}$ is not a tensor
under a general coordinate transformation. Show further that, to first order in $h_{\mu\nu}$,

$$g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu},$$

where $h^{\mu\nu} = \eta^{\mu\rho}\eta^{\nu\sigma}h_{\rho\sigma}$.

17.2 For an infinitesimal general coordinate transformation $x'^\mu = x^\mu + \xi^\mu(x)$, show that
to first order in $\xi^\mu$ the inverse transformation is given by

$$\frac{\partial x^\mu}{\partial x'^\nu} = \delta^\mu_\nu - \partial_\nu\xi^\mu.$$


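17.3 If $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ with $|h_{\mu\nu}| \ll 1$, verify that, to first order in $h_{\mu\nu}$,

$$R^\sigma{}_{\mu\nu\rho} = \tfrac12\left(\partial_\nu\partial_\mu h^\sigma{}_\rho + \partial_\rho\partial^\sigma h_{\mu\nu} - \partial_\nu\partial^\sigma h_{\mu\rho} - \partial_\rho\partial_\mu h^\sigma{}_\nu\right),$$

$$R_{\mu\nu} = \tfrac12\left(\partial_\nu\partial_\mu h + \Box^2h_{\mu\nu} - \partial_\nu\partial_\rho h^\rho{}_\mu - \partial_\rho\partial_\mu h^\rho{}_\nu\right),$$

$$R = \Box^2h - \partial_\rho\partial_\mu h^{\mu\rho}.$$

Hence show that the linearised Einstein field equations are given by

$$\partial_\nu\partial_\mu h + \Box^2h_{\mu\nu} - \partial_\nu\partial_\rho h^\rho{}_\mu - \partial_\rho\partial_\mu h^\rho{}_\nu - \eta_{\mu\nu}\left(\Box^2h - \partial_\rho\partial_\sigma h^{\sigma\rho}\right) = -2\kappa T_{\mu\nu}.$$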


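17.4 The trace reverse of $h_{\mu\nu}$ is defined by

$$\bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \tfrac12\eta_{\mu\nu}h.$$

Show that $\bar{h} = -h$ and $\bar{\bar{h}}_{\mu\nu} = h_{\mu\nu}$. Hence show that the linearised Einstein field
equations in Exercise 17.3 can be written as

$$\Box^2\bar{h}_{\mu\nu} + \eta_{\mu\nu}\partial_\rho\partial_\sigma\bar{h}^{\rho\sigma} - \partial_\nu\partial_\rho\bar{h}^\rho{}_\mu - \partial_\mu\partial_\rho\bar{h}^\rho{}_\nu = -2\kappa T_{\mu\nu}.$$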

17.5 Obtain an expression for the covariant components $R_{\sigma\mu\nu\rho}$ of the linearised Riemann
tensor in Exercise 17.3 and show that it is invariant under a gauge transformation
of the form (17.5). Hence show that the linearised Einstein field equations are also
invariant under such a gauge transformation.

17.6 From the linearised Einstein field equations, show that $\partial_\mu T^{\mu\nu} = 0$.

17.7 For a plane gravity wave of the form $h_{\mu\nu} = A_{\mu\nu}\exp(ik_\lambda x^\lambda)$, show that the linearised
Riemann tensor is given by

$$R_{\sigma\mu\nu\rho} = \tfrac12\left(k_\nu k_\sigma h_{\mu\rho} + k_\rho k_\mu h_{\sigma\nu} - k_\nu k_\mu h_{\sigma\rho} - k_\rho k_\sigma h_{\mu\nu}\right).$$

Hence show that the linearised Ricci tensor is given by

$$R_{\mu\nu} = \tfrac12\left(k_\nu w_\mu + k_\mu w_\nu - k^2h_{\mu\nu}\right),$$

where $k^2 = k_\mu k^\mu$ and $w_\mu = k^\rho\bar{h}_{\mu\rho}$. Hence show that the linearised Einstein field
equations require that

$$k^2h_{\mu\nu} = k_\nu w_\mu + k_\mu w_\nu.$$

17.8 From your answer to Exercise 17.7, show that for $k^2 \neq 0$ one requires $R_{\sigma\mu\nu\rho} = 0$.
Hence show that this case does not correspond to a physical wave but merely a
periodic oscillation of the coordinate system.

17.9 From your answer to Exercise 17.7, show that for $k^2 = 0$ one requires $k^\rho\bar{h}_{\mu\rho} = 0$.
Hence show that the wavevector $k^\mu$ is an eigenvector of the Riemann tensor in the
sense that $R_{\sigma\mu\nu\rho}k^\rho = 0$.

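17.10 Show explicitly that

$$\bar{h}^{\mu\nu}(x^\sigma) = \bar{h}^{\mu\nu}_{(0)}(x^\sigma) - 2\kappa\int G(x^\sigma - y^\sigma)\,T^{\mu\nu}(y^\sigma)\,d^4y$$

is a solution of the linearised Einstein field equations in the Lorenz gauge if
$\bar{h}^{\mu\nu}_{(0)}(x^\sigma)$ is any solution of the linearised field equations in vacuo and $G(x^\sigma - y^\sigma)$
satisfies

$$\Box^2_xG(x^\sigma - y^\sigma) = \delta^{(4)}(x^\sigma - y^\sigma).$$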

17.11 The Green’s function $G(x^\sigma - y^\sigma)$ satisfies the equation

$$\Box^2_xG(x^\sigma - y^\sigma) = \delta^{(4)}(x^\sigma - y^\sigma).$$

Show that the four-dimensional Dirac delta function can be written as

$$\delta^{(4)}(x^\sigma - y^\sigma) = \frac{1}{(2\pi)^4}\int\exp[ik_\lambda(x^\lambda - y^\lambda)]\,d^4k.$$

Hence, by writing the Green’s function in terms of its Fourier transform $\tilde{G}(k^\sigma)$,
show that

$$G(x^\sigma - y^\sigma) = -\frac{1}{(2\pi)^4}\int\frac{1}{k^2}\exp[ik_\lambda(x^\lambda - y^\lambda)]\,d^4k,$$

where $k^2 = k_\lambda k^\lambda$.

17.12 Verify that the solution in Exercise 17.10 satisfies the Lorenz gauge condition.

17.13 Prove the results (17.32, 17.33) for the derivatives of a function of retarded time.

17.14 By writing $r = |\vec{x}| = (\delta_{ij}x^ix^j)^{1/2}$, show that


Hence show that

$3x_ix_j - r^2\delta_{ij}$
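17.15 In Newtonian gravity, the gravitational potential $\Phi(\vec{x})$ produced by some density
distribution $\rho(\vec{x})$ is given by

$$\Phi(\vec{x}) = -G\int_V\frac{\rho(\vec{y})}{|\vec{x} - \vec{y}|}\,d^3\vec{y},$$

where the integral extends over the volume of the distribution. Show that

$$\frac{1}{|\vec{x} - \vec{y}|} = \frac{1}{|\vec{x}|} + \frac{\vec{x}\cdot\vec{y}}{|\vec{x}|^3} + O\!\left(\frac{1}{|\vec{x}|^3}\right).$$

Hence show that the gravitational potential can be written as

$$\Phi(\vec{x}) = -\frac{GM}{|\vec{x}|} - \frac{G\,\vec{d}\cdot\vec{x}}{|\vec{x}|^3} + O\!\left(\frac{1}{|\vec{x}|^3}\right),$$

where

$$M = \int_V\rho(\vec{y})\,d^3\vec{y} \quad\text{and}\quad \vec{d} = \int_V\rho(\vec{y})\,\vec{y}\,d^3\vec{y}.$$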

17.16 From the conservation equation $\partial_\mu T^{\mu\nu} = 0$, show that

$$\partial_0T^{00} + \partial_kT^{0k} = 0 \quad\text{and}\quad \partial_0T^{i0} + \partial_kT^{ik} = 0.$$

By integrating each equation over a spatial volume $V$ whose bounding surface $S$
encloses the energy-momentum source and using the three-dimensional divergence
theorem, show that the quantities

$$Mc^2 = \int T^{00}\,d^3\vec{y} \quad\text{and}\quad P^i = \int T^{i0}\,d^3\vec{y}$$

are constants and give a physical interpretation of them.

17.17 For a stationary source, show that $\partial_iT^{0i} = 0$. Hence show that

$$\int_V\left(T^{0i}y^j + T^{0j}y^i\right)d^3\vec{y} = 0,$$

where the spatial volume $V$ encloses the source.

17.18 For a non-relativistic stationary source, show that, in centre-of-momentum
coordinates,

$$\bar{h}^{00}(\vec{x}) = -\frac{4GM}{c^2|\vec{x}|} + O\!\left(\frac{1}{|\vec{x}|^3}\right),$$

$$\bar{h}^{0i}(\vec{x}) = \frac{2G}{c^3|\vec{x}|^3}\,\delta_{jk}\,x^jJ^{ik} + O\!\left(\frac{1}{|\vec{x}|^3}\right),$$

$$\bar{h}^{ij}(\vec{x}) = 0,$$

where the quantities $M$ and $J^{ij}$ are given by

$$M = \int_V\rho(\vec{y})\,d^3\vec{y} \quad\text{and}\quad J^{ij} = \int_V\left[y^ip^j(\vec{y}) - y^jp^i(\vec{y})\right]d^3\vec{y},$$

in which $\rho(\vec{y})$ is the proper density distribution of the source and $p^i(\vec{y}) = \rho(\vec{y})u^i(\vec{y})$
is the momentum density distribution of the source. Give a physical interpretation
of $J^{ij}$.

Hint: You will find your answer to Exercise 17.17 useful.

17.19 Use your answer to Exercise 17.18 to show that, for a stationary non-relativistic
source, the gravitational scalar and vector potentials respectively are given to
leading order in $1/|\vec{x}|$ by

$$\Phi_g(\vec{x}) = -\frac{GM}{|\vec{x}|} \quad\text{and}\quad \vec{A}_g(\vec{x}) = -\frac{2G}{c^2|\vec{x}|^3}\,\vec{J}\times\vec{x},$$

where $\vec{J} = \int(\vec{y}\times\vec{p})\,d^3\vec{y}$ is the total angular momentum vector of the source.
Show further that these expressions are exact in the linear theory for a spherically
symmetric source.

17.20 Use your answer to Exercise 17.19 to show that, in the linear theory, the line
element outside a spherically symmetric matter distribution rotating about the
$z$-axis at a steady rate is given by

$$ds^2 = c^2\left(1 - \frac{2GM}{c^2r}\right)dt^2 + \frac{4GJ}{c^2r^3}(x\,dy - y\,dx)\,dt - \left(1 + \frac{2GM}{c^2r}\right)(dx^2 + dy^2 + dz^2),$$

where $r = |\vec{x}|$. Show that this is equal to the Kerr line element to first order in $M$
and $J$.

Hint: $A_i\,dx^i = -\delta_{ij}A^j\,dx^i = -\vec{A}\cdot d\vec{x}$ and $(\vec{J}\times\vec{x})\cdot d\vec{x} = \vec{J}\cdot(\vec{x}\times d\vec{x})$.

17.21 If $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, show that the terms in the Einstein tensor that are second order
in $h_{\mu\nu}$ are given by

$$G^{(2)}_{\mu\nu} = R^{(2)}_{\mu\nu} - \tfrac12h_{\mu\nu}R^{(1)} - \tfrac12\eta_{\mu\nu}R^{(2)},$$

where $R^{(2)}_{\mu\nu}$ denotes the terms in the Ricci tensor that are second order in $h_{\mu\nu}$, and
$R^{(1)}$ and $R^{(2)}$ denote the terms in the Ricci scalar that are first and second order
in $h_{\mu\nu}$ respectively. Show further that this quantity is not invariant under a gauge
transformation of the form (17.5).

17.22 Verify that the energy-momentum tensor of the linearised gravitational field is
given by (17.64). Show further that this tensor is invariant under a gauge
transformation of the form (17.5).

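17.23 Use your answer to Exercise 17.19 to show that, in the linear theory, a spherically
symmetric body of mass $M$ rotating steadily with angular momentum $\vec{J}$ produces
gravitoelectric and gravitomagnetic fields given respectively by

$$\vec{E}_g(\vec{x}) = -\frac{GM}{|\vec{x}|^2}\,\hat{\vec{x}} \quad\text{and}\quad \vec{B}_g(\vec{x}) = \frac{2G}{c^2|\vec{x}|^3}\left[\vec{J} - 3(\vec{J}\cdot\hat{\vec{x}})\,\hat{\vec{x}}\right],$$

where $\hat{\vec{x}}$ is a unit vector in the $\vec{x}$-direction.

Hint: For any scalar field $\phi$ and spatial vector fields $\vec{a}$ and $\vec{b}$ one has
$\nabla\times(\phi\vec{a}) = \nabla\phi\times\vec{a} + \phi(\nabla\times\vec{a})$ and
$\nabla\times(\vec{a}\times\vec{b}) = \vec{a}(\nabla\cdot\vec{b}) - \vec{b}(\nabla\cdot\vec{a}) + (\vec{b}\cdot\nabla)\vec{a} - (\vec{a}\cdot\nabla)\vec{b}$. Also,
$\nabla(1/|\vec{x}|^3) = -3\vec{x}/|\vec{x}|^5$.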
17.24 Consider a particle moving under gravity at speed $v$ in a circular orbit of radius
$r$ in the equatorial plane of the body in Exercise 17.23. Show that the vector
acceleration of the particle is given by

$$\vec{a} = -\frac{GM}{r^2}\,\hat{\vec{r}} \pm \frac{2GJv}{c^2r^3}\,\hat{\vec{r}},$$

where $\vec{r} = r\,\hat{\vec{r}}$ is the position vector of the orbiting particle and the plus and minus signs
correspond to prograde and retrograde orbits respectively. Hence show that the
angular velocity $\omega$ of the particle is given to first order in $J$ by

$$\omega^2 = \frac{GM}{r^3} \mp \frac{2GJ\omega}{c^2r^3},$$

where the minus and plus signs now correspond to prograde and retrograde orbits
respectively. Thus show that the retrograde orbit has a shorter period than the
prograde orbit.

17.25 In electromagnetism, the magnetic dipole moment of a current density distribution
$\vec{j}(\vec{y})$ is defined by $\vec{m} = \tfrac12\int(\vec{y}\times\vec{j})\,d^3\vec{y}$, and the force and torque on the dipole
in a magnetic field $\vec{B}$ are given by $\vec{F} = (\vec{m}\cdot\nabla)\vec{B}$ and $\vec{T} = \vec{m}\times\vec{B}$ respectively.
Hence deduce that, in linearised gravity, the force and torque exerted by the
gravitomagnetic field $\vec{B}_g$ on a spinning body with spin angular momentum $\vec{s}$ are
given respectively by

$$\vec{F}_g = \tfrac12(\vec{s}\cdot\nabla)\vec{B}_g \quad\text{and}\quad \vec{T}_g = \tfrac12(\vec{s}\times\vec{B}_g).$$

Thus show that the spin angular momentum of the body will evolve as

$$\frac{d\vec{s}}{dt} = \tfrac12(\vec{s}\times\vec{B}_g)$$

and therefore that $\vec{s}$ precesses about $\vec{B}_g$ with angular velocity $\Omega = -\tfrac12|\vec{B}_g|$ (i.e. in
the negative sense). This is called the Lense-Thirring precession.

17.26 A gyroscope is in orbit about the massive rotating body in Exercise 17.23. Use
your answer to Exercise 17.25 to show that the precessional angular velocity vector
of the gyroscope is given by

$$\vec{\Omega} = \frac{G}{c^2|\vec{x}|^3}\left[3(\vec{J}\cdot\hat{\vec{x}})\,\hat{\vec{x}} - \vec{J}\right],$$

where $\vec{x}$ is the position vector of the gyroscope relative to the centre of the massive
body. Show that this result agrees with that derived in Section 13.20 when $\vec{x}$ points
along $\vec{J}$.



18 

Gravitational waves 


In the previous chapter, we saw that the linearised field equations of general
relativity could be written in the form of a wave equation

$$\Box^2\bar{h}^{\mu\nu} = -2\kappa T^{\mu\nu}, \qquad (18.1)$$

provided that the $\bar{h}^{\mu\nu}$ satisfy the Lorenz gauge condition

$$\partial_\mu\bar{h}^{\mu\nu} = 0. \qquad (18.2)$$

This suggests the existence of gravitational waves in an analogous manner to that
in which Maxwell’s equations predict electromagnetic waves. In this chapter, we
discuss in detail the propagation, generation and detection of such gravitational
radiation. As in the previous chapter, we will adopt the viewpoint that $h_{\mu\nu}$ is
simply a symmetric tensor field (under global Lorentz transformations) defined
on a flat Minkowski background spacetime.


18.1 Plane gravitational waves and polarisation states 

In Section 17.5, we showed that the general solution of the linearised field
equations in vacuo may be written as the superposition of plane-wave solutions
of the form

$$\bar{h}^{\mu\nu} = A^{\mu\nu}\exp(ik_\rho x^\rho), \qquad (18.3)$$

where the $A^{\mu\nu}$ are constant (and, in general, complex) components of a symmetric
tensor and $k_\mu$ are the constant (real) components of a vector. The Lorenz gauge
condition is satisfied provided that the additional constraint

$$A^{\mu\nu}k_\nu = 0 \qquad (18.4)$$

is obeyed. Physical solutions corresponding to propagating plane gravitational
waves in empty space may be obtained by taking the real part of (18.3):

$$\bar{h}^{\mu\nu} = \Re\left[A^{\mu\nu}\exp(ik_\rho x^\rho)\right] = \tfrac12A^{\mu\nu}\exp(ik_\rho x^\rho) + \tfrac12(A^{\mu\nu})^*\exp(-ik_\rho x^\rho),$$

which is clearly just a superposition of two plane waves of the form (18.3).

The constants $A^{\mu\nu}$ are the components of the amplitude tensor, and the $k^\mu = \eta^{\mu\nu}k_\nu$
are the components of the 4-wavevector. It is conventional to denote
the components of the 4-wavevector by $[k^\mu] = (\omega/c, \vec{k})$, where $\vec{k}$ is the spatial
3-wavevector in the direction of propagation and $\omega$ is the angular frequency of
the wave. The nullity of $k^\mu$ implies that $\omega^2 = c^2|\vec{k}|^2$, and so both the group and
phase velocity of a gravitational wave are equal to the speed of light.

Since $A^{\mu\nu} = A^{\nu\mu}$, the amplitude tensor has 10 different (complex) components,
but the four Lorenz gauge conditions (18.4) reduce the number of independent
components to six. Moreover, we still have the freedom to make a further gauge
transformation of the form (17.5), which will preserve the Lorenz gauge provided
that we choose the four functions $\xi^\mu(x)$ so that they satisfy $\Box^2\xi^\mu = 0$. As we show
below, this may be used to reduce the number of independent components in the
amplitude matrix from six to just two. This results in two possible polarisations
for plane gravitational waves.

It is convenient to consider the concrete example of a plane gravitational
wave propagating in the $x^3$-direction, in which case the components of the
4-wavevector are

$$[k^\mu] = (k, 0, 0, k), \qquad (18.5)$$

where $k = \omega/c$. The Lorenz gauge condition (18.4) then immediately gives
$A^{\mu3} = A^{\mu0}$. Together with the symmetry of the amplitude tensor, this implies
that all the components $A^{\mu\nu}$ can be expressed in terms of the six quantities
$A^{00}, A^{01}, A^{02}, A^{11}, A^{12}, A^{22}$:

$$[A^{\mu\nu}] = \begin{pmatrix} A^{00} & A^{01} & A^{02} & A^{00} \\ A^{01} & A^{11} & A^{12} & A^{01} \\ A^{02} & A^{12} & A^{22} & A^{02} \\ A^{00} & A^{01} & A^{02} & A^{00} \end{pmatrix}.$$


We may now perform a gauge transformation of the form (17.5) to simplify the
amplitude tensor still further. To preserve the Lorenz gauge condition we must
ensure that $\Box^2\xi^\mu = 0$. A suitable transformation, which satisfies this condition, is
given by

$$\xi^\mu = \epsilon^\mu\exp(ik_\rho x^\rho),$$

where the $\epsilon^\mu$ are constants. Substituting this expression into the transformation
law (17.12) for the trace reverse tensor $\bar{h}^{\mu\nu}$, which we assume to be of the form
(18.3), one quickly finds that the amplitude tensor transforms as

$$A'^{\mu\nu} = A^{\mu\nu} - i\epsilon^\mu k^\nu - i\epsilon^\nu k^\mu + i\eta^{\mu\nu}\epsilon^\rho k_\rho. \qquad (18.6)$$

Using the expression (18.5) for the 4-wavevector and the result (18.6), we obtain

$$A'^{00} = A^{00} - ik(\epsilon^0 + \epsilon^3), \qquad A'^{11} = A^{11} - ik(\epsilon^0 - \epsilon^3),$$

$$A'^{01} = A^{01} - ik\epsilon^1, \qquad A'^{12} = A^{12},$$

$$A'^{02} = A^{02} - ik\epsilon^2, \qquad A'^{22} = A^{22} - ik(\epsilon^0 - \epsilon^3).$$

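Now, by choosing the constants as follows,

$$\epsilon^0 = -i(2A^{00} + A^{11} + A^{22})/(4k), \qquad \epsilon^1 = -iA^{01}/k,$$

$$\epsilon^2 = -iA^{02}/k, \qquad \epsilon^3 = -i(2A^{00} - A^{11} - A^{22})/(4k),$$

we obtain

$$A'^{00} = A'^{01} = A'^{02} = 0 \quad\text{and}\quad A'^{11} = -A'^{22}.$$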


On dropping primes, the first condition means that only $A^{11}$, $A^{12}$ and $A^{22}$ are
non-zero. Moreover, the second condition means that only two of these can be
specified independently. Choosing $A^{11} = a$ and $A^{12} = b$ as the two independent
(in general, complex) components in our new gauge, we thus have

$$[A^{\mu\nu}_{\rm TT}] = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & a & b & 0 \\ 0 & b & -a & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \qquad (18.7)$$

for a wave travelling in the $x^3$-direction. As indicated, the new gauge we have
adopted is known as the transverse-traceless gauge (or TT gauge), which we will
discuss in more detail in Section 18.3. For now we simply note that (18.7) implies
$\bar{h}_{\rm TT} = 0 = h_{\rm TT}$ (hence the term traceless) and $\bar{h}^{\rm TT}_{\mu\nu} = h^{\rm TT}_{\mu\nu}$ for our plane wave.

It is also convenient to introduce the two linear polarisation tensors $e_1^{\mu\nu}$ and
$e_2^{\mu\nu}$, the components of which are obtained by setting $a = 1$, $b = 0$ and $a = 0$, $b = 1$
respectively in (18.7). The general amplitude tensor in the TT gauge for a wave
travelling in the $x^3$-direction can then be written as

$$A^{\mu\nu}_{\rm TT} = a\,e_1^{\mu\nu} + b\,e_2^{\mu\nu}.$$

It follows that all possible polarisations of the gravitational wave may be obtained
by superposing just two polarisations, with arbitrary amplitudes and relative
phases.
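
The reduction from six independent components to two can be checked numerically. The sketch below is only an illustration: it builds a Lorenz-gauge amplitude for a wave with $[k^\mu] = (k, 0, 0, k)$, applies the transformation (18.6) with the constants $\epsilon^\mu$ chosen as in the text, and returns the primed amplitude. The function names and the sample numbers are assumptions made for this example, and the metric signature is taken to be $(+,-,-,-)$.

```python
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def lorenz_gauge_amplitude(a00, a01, a02, a11, a12, a22):
    """Symmetric A^{mu nu} with A^{mu 3} = A^{mu 0}, as required by (18.4)."""
    return [
        [a00, a01, a02, a00],
        [a01, a11, a12, a01],
        [a02, a12, a22, a02],
        [a00, a01, a02, a00],
    ]


def to_tt_gauge(A, k):
    """Apply (18.6) with the epsilon^mu that remove A^{00}, A^{01}, A^{02}."""
    k_up = [k, 0.0, 0.0, k]      # contravariant components k^mu
    k_down = [k, 0.0, 0.0, -k]   # covariant components k_mu
    eps = [
        -1j * (2 * A[0][0] + A[1][1] + A[2][2]) / (4 * k),
        -1j * A[0][1] / k,
        -1j * A[0][2] / k,
        -1j * (2 * A[0][0] - A[1][1] - A[2][2]) / (4 * k),
    ]
    eps_dot_k = sum(e * kd for e, kd in zip(eps, k_down))
    return [
        [
            A[m][n] - 1j * eps[m] * k_up[n] - 1j * eps[n] * k_up[m]
            + 1j * ETA[m][n] * eps_dot_k
            for n in range(4)
        ]
        for m in range(4)
    ]


# Arbitrary sample amplitude (illustrative values only).
A = lorenz_gauge_amplitude(1 + 2j, 0.5, -1j, 3.0, 2 - 1j, 1.0)
A_tt = to_tt_gauge(A, k=2.0)
for row in A_tt:
    print(row)
```

Only the central 2 x 2 block survives, with $A'^{11} = -A'^{22} = \tfrac12(A^{11} - A^{22})$ and $A'^{12} = A^{12}$, so the transformed amplitude has the form (18.7).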
18.2 Analogy between gravitational and electromagnetic waves

Before going on to discuss gravitational waves in more detail, it is instructive to
illustrate the close analogy with electromagnetic waves. By adopting the Lorenz
gauge condition $\partial_\mu A^\mu = 0$, the electromagnetic field equations in free space take
the form $\Box^2A^\mu = 0$. These admit plane-wave solutions of the form

$$A^\mu = \Re\left[Q^\mu\exp(ik_\rho x^\rho)\right],$$

where the $Q^\mu$ are the constant components of the amplitude vector. The field
equations again imply that the 4-wavevector $k^\mu$ is null and the Lorenz gauge
condition requires that $Q^\mu k_\mu = 0$, thereby reducing the number of independent
components in the amplitude vector to three. In particular, if we again consider
a wave propagating in the $x^3$-direction then $[k^\mu] = (k, 0, 0, k)$ and the Lorenz
gauge condition implies that $Q^0 = Q^3$, so that

$$[Q^\mu] = (Q^0, Q^1, Q^2, Q^0).$$

The Lorenz gauge condition is preserved by any further gauge transformation
of the form $A_\mu \to A_\mu + \partial_\mu\psi$, provided that $\Box^2\psi = 0$. An appropriate gauge
transformation that satisfies this condition is

$$\psi = \epsilon\exp(ik_\rho x^\rho),$$

where $\epsilon$ is a constant. This yields $Q'^\mu = Q^\mu + i\epsilon k^\mu$, and so

$$Q'^0 = Q^0 + i\epsilon k, \qquad Q'^1 = Q^1, \qquad Q'^2 = Q^2.$$

By choosing $\epsilon = iQ^0/k$, so that $Q'^0 = Q^0 + i\epsilon k$ vanishes, on dropping primes
we have $Q^0 = 0$. In the new
gauge, the amplitude vector has just two independent components, $Q^1$ and $Q^2$,
and the electromagnetic fields are transverse to the direction of propagation. By
introducing the two linear polarisation vectors

$$e_1^\mu = (0, 1, 0, 0) \quad\text{and}\quad e_2^\mu = (0, 0, 1, 0),$$

we may write the general amplitude vector as

$$Q^\mu = a\,e_1^\mu + b\,e_2^\mu,$$

where $a$ and $b$ are arbitrary (in general, complex) constants.

If $b = 0$ then as the electromagnetic wave passes a free positive test charge this
will oscillate in the $x^1$-direction with a magnitude that varies sinusoidally with
time. Similarly, if $a = 0$ then the test charge will oscillate in the $x^2$-direction. The
particular combinations of linear polarisations given by $b = \pm ia$ give circularly
polarised waves, in which the mutually orthogonal linear oscillations combine in
such a way that the test charge moves in a circle.
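
The electromagnetic gauge fixing can be sketched in the same spirit. In the illustration below the constant $\epsilon$ is obtained by solving $Q^0 + i\epsilon k = 0$ directly, and the shift $Q^\mu \to Q^\mu + i\epsilon k^\mu$ is then applied with $[k^\mu] = (k, 0, 0, k)$. The function name and the sample amplitude are hypothetical choices for this example.

```python
def remove_longitudinal_part(q, k):
    """Shift Q^mu -> Q^mu + i*eps*k^mu with eps fixed by Q'^0 = 0."""
    k_up = (k, 0.0, 0.0, k)
    eps = -q[0] / (1j * k)  # solves Q^0 + i*eps*k = 0
    return tuple(q_mu + 1j * eps * k_mu for q_mu, k_mu in zip(q, k_up))


# Sample Lorenz-gauge amplitude with Q^3 = Q^0 (illustrative values only).
q = (2 + 1j, 1.0, 3j, 2 + 1j)
q_new = remove_longitudinal_part(q, k=0.5)
print(q_new)
```

Because $Q^3 = Q^0$ and $k^3 = k^0$, the same shift removes both components, leaving only the transverse amplitudes $Q^1$ and $Q^2$.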
18.3 Transforming to the transverse-traceless gauge

In Section 18.1, we considered only the transformation into the TT gauge of a
plane gravitational wave travelling in the $x^3$-direction. We now consider a general
gravitational perturbation $h_{\mu\nu}$ satisfying the empty-space linearised field equation
and the Lorenz gauge condition. As discussed previously, a gauge transformation
of the form (17.5) will preserve the Lorenz gauge condition provided that the four
functions $\xi^\mu(x)$ satisfy $\Box^2\xi^\mu = 0$. From (17.12), the trace-reverse field tensor
transforms as

$$\bar{h}'^{\mu\nu} = \bar{h}^{\mu\nu} - \partial^\mu\xi^\nu - \partial^\nu\xi^\mu + \eta^{\mu\nu}\partial_\rho\xi^\rho.$$

Since the components $\bar{h}^{\mu\nu}$ also satisfy the in vacuo wave equation $\Box^2\bar{h}^{\mu\nu} = 0$,
this gauge transformation may be used to set any four linear combinations of the
$\bar{h}'^{\mu\nu}$ to zero. The TT gauge is defined by choosing

$$\bar{h}^{0i}_{\rm TT} = 0 \quad\text{and}\quad \bar{h}_{\rm TT} = 0. \qquad (18.8)$$

This last condition means that $\bar{h}^{\mu\nu}_{\rm TT} = h^{\mu\nu}_{\rm TT}$, and these quantities may therefore
be used interchangeably. Moreover, setting $\nu = 0$ and $\nu = j$ respectively in the
Lorenz gauge condition $\partial_\mu\bar{h}^{\mu\nu}_{\rm TT} = 0$, and using (18.8), gives the constraints

$$\partial_0\bar{h}^{00}_{\rm TT} = 0 \quad\text{and}\quad \partial_i\bar{h}^{ij}_{\rm TT} = 0. \qquad (18.9)$$

We note that, if the gravitational field perturbation is non-stationary (i.e. it depends
on $t$), as for a general gravitational wave disturbance, the first constraint in (18.9)
implies that $\bar{h}^{00}_{\rm TT}$ also vanishes and so $\bar{h}^{0\mu}_{\rm TT} = 0$ for all $\mu$. In other words, in this
case only the spatial components $\bar{h}^{ij}_{\rm TT}$ are non-zero.

Let us now consider the particular case of an arbitrary plane gravitational wave
of the form (18.3) and satisfying the Lorenz gauge condition. The conditions
(18.8) immediately imply that

$$A^{0i}_{\rm TT} = 0 \quad\text{and}\quad (A_{\rm TT})^\mu{}_\mu = 0.$$

Moreover, the conditions (18.9) also require that

$$A^{00}_{\rm TT} = 0 \quad\text{and}\quad A^{ij}_{\rm TT}k_j = 0.$$

These last conditions ensure that, quite generally, a plane gravitational wave is
transverse, like electromagnetic waves.

The above conditions tell us the constraints on the form of $A^{\mu\nu}_{\rm TT}$. We must
now consider how to construct this tensor for a plane wave with a given spatial
wavevector $\vec{k}$ and amplitude matrix $A^{\mu\nu}$. First, it is clear that we need consider
only the spatial components $A^{ij}_{\rm TT}$, since the remaining components are all zero.
Moreover, from the above conditions, this spatial tensor must be orthogonal to $\vec{k}$
and traceless. We therefore introduce the spatial projection tensor

$$P_{ij} \equiv \delta_{ij} - n_in_j,$$

which projects spatial tensor components onto the surface orthogonal to the unit
spatial vector with components $n^i$. The action of the projection tensor is easily
illustrated by applying it to an arbitrary spatial vector $v^i$. One quickly finds that
$n_iP^i{}_jv^j = 0$ and $P^i{}_kP^k{}_jv^j = P^i{}_jv^j$, as required. In the case of our plane gravitational
wave, we choose $n^i$ to lie in the direction of the spatial wavevector, so that $n^i = \hat{k}^i$,
and thus obtain the components of the spatial amplitude tensor that are transverse
to the direction of propagation, namely

$$A^{ij}_{\rm T} = P^i{}_kP^j{}_lA^{kl}.$$

The trace of this tensor is given by $(A_{\rm T})^i{}_i = P_{kl}A^{kl}$, which in general does not
vanish. Using the fact that $P^i{}_i = 3 - 1 = 2$, we may however construct a traceless
tensor that still remains transverse to $\vec{k}$; this is given by

$$A^{ij}_{\rm TT} = \left(P^i{}_kP^j{}_l - \tfrac12P^{ij}P_{kl}\right)A^{kl}. \qquad (18.10)$$


For a plane gravitational wave travelling in the $x^3$-direction, so that $[k^\mu] =
(k, 0, 0, k)$, it is a simple matter to verify that (18.10) produces an amplitude
matrix of the form

$$[A^{\mu\nu}_{\rm TT}] = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \tfrac12(A^{11} - A^{22}) & A^{12} & 0 \\ 0 & A^{12} & \tfrac12(A^{22} - A^{11}) & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad (18.11)$$


which agrees with that given in (18.7). In fact this result illustrates that there is 
a quick and simple algorithm for transforming a plane wave travelling along one 
of the coordinate directions into the TT gauge. We see that the transformation 
(18.11) corresponds to setting to zero all components that are not transverse to 
the direction of wave propagation and subtracting one-half the resulting trace 
from the remaining diagonal elements, to make the final tensor traceless. There 
is, however, nothing special about our choice of x 3 -direction and so the above 
prescription must be true for a plane wave travelling in any of the three coordinate 
directions. 
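
The projection prescription is straightforward to try out numerically. The following minimal sketch applies a transverse-traceless projection of the form $(P_{ik}P_{jl} - \tfrac12P_{ij}P_{kl})A_{kl}$, with $P_{ij} = \delta_{ij} - n_in_j$, to the spatial part of an amplitude matrix for an arbitrary propagation direction. Spatial indices are treated as Cartesian, so index placement is ignored; the function names and sample matrix are assumptions made for this illustration.

```python
import math


def projector(n):
    """P_ij = delta_ij - n_i n_j for the unit vector along n."""
    norm = math.sqrt(sum(c * c for c in n))
    u = [c / norm for c in n]
    return [
        [(1.0 if i == j else 0.0) - u[i] * u[j] for j in range(3)]
        for i in range(3)
    ]


def tt_project(A, n):
    """Transverse-traceless part of the 3x3 matrix A for direction n."""
    P = projector(n)
    idx = range(3)
    trace_t = sum(P[k][l] * A[k][l] for k in idx for l in idx)
    return [
        [
            sum(P[i][k] * P[j][l] * A[k][l] for k in idx for l in idx)
            - 0.5 * P[i][j] * trace_t
            for j in idx
        ]
        for i in idx
    ]


# Sample symmetric spatial amplitude (illustrative values only).
A_spatial = [[3.0, 2.0, 5.0], [2.0, 1.0, 7.0], [5.0, 7.0, 4.0]]
A_tt_z = tt_project(A_spatial, (0.0, 0.0, 1.0))      # wave along x^3
A_tt_n = tt_project(A_spatial, (1.0, 2.0, 2.0))      # oblique direction
print(A_tt_z)
```

For propagation along $x^3$ the result reduces to the quick rule described above: the non-transverse entries are set to zero and half the remaining trace is subtracted from the two surviving diagonal elements.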
18.4 The effect of a gravitational wave on free particles


Let us now consider the motion of a set of test particles, initially at rest, in the
presence of a gravitational wave. In fact, for a gravitational wave it is not enough to
consider the trajectory of just a single test particle, as we discuss below. To obtain
a coordinate-independent measure of the effects of the wave, it is necessary to
consider the relative motion of a set of nearby particles.

First consider a single free test particle, whose 4-velocity $u^\sigma$ must satisfy the
geodesic equation

$$\frac{du^\sigma}{d\tau} + \Gamma^\sigma{}_{\mu\nu}u^\mu u^\nu = 0.$$

Suppose that the particle is initially at rest in our chosen coordinate system, so
that $[u^\mu] = c(1, 0, 0, 0)$. The geodesic equation then reads

$$\frac{du^\sigma}{d\tau} = -c^2\Gamma^\sigma{}_{00} = -\tfrac12c^2\eta^{\sigma\rho}\left(\partial_0h_{\rho0} + \partial_0h_{0\rho} - \partial_\rho h_{00}\right),$$

where in the last equality we have used (17.6) to obtain the connection coefficients
to first order in terms of the derivatives of $h_{\mu\nu}$. Let us now adopt the TT gauge,
which we may do for any general gravitational wave disturbance in vacuo. From
the discussion in Section 18.3, we know that $h^{\rm TT}_{\rho0} = 0$ for all values of $\rho$. Thus,
initially, $du^\sigma/d\tau = 0$ and so the particle will still be at rest a moment later.
The argument may then be repeated, showing that the particle remains at rest
forever, regardless of the passing of the gravitational wave. In other words $[u^\sigma] =
c(1, 0, 0, 0)$ is a solution of the geodesic equation in this case, as may readily be
verified by direct substitution.

What has gone wrong here? The key point is that ‘at rest’ in this context means 
simply that the particle has constant spatial coordinates. What we have uncovered 
is that by choosing the TT gauge we have found a coordinate system that stays 
attached to individual particles. This has no coordinate-invariant physical meaning. 
To obtain a proper physical interpretation of the effect of a passing gravitational 
wave, we must consider a set of nearby particles. 

Let us therefore consider a cloud of non-interacting free test particles. From
the above discussion, the worldlines of the particles are curves having constant
spatial coordinates. Thus the small spacelike vector $\xi^\mu = (0, \xi^1, \xi^2, \xi^3)$ giving
the coordinate separation between any two nearby particles is constant (this may
also be shown explicitly by demonstrating that the equation of geodesic deviation
(7.24) has $\xi^\mu = \text{constant}$ as a solution in this case). Although the coordinate
separation of the particles is constant, this does not mean that their physical spatial
separation $l$ is constant. The latter is given by

$$l^2 = -g_{ij}\xi^i\xi^j = (\delta_{ij} - h_{ij})\xi^i\xi^j,$$

where not all the $h_{ij}$ are constant (in any gauge) and $i, j = 1, 2, 3$. Thus we see
that the passing of a gravitational wave will indeed cause the physical separation
of nearby particles to vary. It is convenient at this point to introduce the quantities

$$\zeta^i = \xi^i + \tfrac12h^i{}_k\xi^k. \qquad (18.12)$$

One then finds straightforwardly that, in terms of these new variables (to first
order in $h_{\mu\nu}$),

$$l^2 = \delta_{ij}\zeta^i\zeta^j,$$

which is again valid in any gauge. Thus, the $\zeta^i$ may be regarded as the components
of a position vector giving the correct physical spatial separation when contracted
with the Euclidean metric tensor $\delta_{ij}$.
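
The first-order equivalence of the two expressions for the physical separation can be checked with a few lines of code. The sketch below assumes the definition $\zeta^i = \xi^i + \tfrac12h^i{}_k\xi^k$ with $h^i{}_k = -h_{ik}$ (one minus sign from raising a spatial index), and compares $\delta_{ij}\zeta^i\zeta^j$ with $(\delta_{ij} - h_{ij})\xi^i\xi^j$ for a small, arbitrarily chosen perturbation. All names and numbers are illustrative.

```python
def zeta(xi, h):
    """zeta^i = xi^i - (1/2) h_ik xi^k, with h holding the covariant h_ik."""
    return [
        xi[i] - 0.5 * sum(h[i][k] * xi[k] for k in range(3))
        for i in range(3)
    ]


def separation_squared(xi, h):
    """l^2 = (delta_ij - h_ij) xi^i xi^j."""
    return sum(
        ((1.0 if i == j else 0.0) - h[i][j]) * xi[i] * xi[j]
        for i in range(3)
        for j in range(3)
    )


# A small symmetric perturbation and a coordinate separation (sample values).
amplitude = 1e-6
h = [[amplitude, 0.5 * amplitude, 0.0],
     [0.5 * amplitude, -amplitude, 0.0],
     [0.0, 0.0, 0.0]]
xi = [1.0, 2.0, 0.0]

l2_metric = separation_squared(xi, h)
l2_euclid = sum(z * z for z in zeta(xi, h))
print(l2_metric, l2_euclid, l2_metric - l2_euclid)
```

The two values differ only at second order in the perturbation, whereas each differs from the unperturbed value $\delta_{ij}\xi^i\xi^j$ at first order.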

Let us now discuss the particular case of a plane gravitational wave propagating
in the $x^3$-direction and consider a set of particles initially at rest in the $(x^1, x^2)$-plane,
i.e. the plane perpendicular to the direction of wave propagation. Thus,
the coordinate separation vector between any two particles has $\xi^3 = 0$. In the
TT gauge, however, we see from (18.7) that $(h_{\rm TT})^3{}_i = 0$, and so (18.12) implies
that $\zeta^3 = 0$ throughout the passage of the wave. Hence the particles remain in
the plane perpendicular to the wave propagation direction; it is only the physical
separations in the transverse directions that vary. Thus the gravitational wave is
transverse not only in its mathematical description ($h^{\rm TT}_{\mu\nu}$) but also in its physical
effects.

We first consider the effect of the passage of a gravitational wave with $A^{\mu\nu} =
a\,e_1^{\mu\nu}$ (i.e. a single polarisation), where we take $a$ to be real and positive for
convenience, and $e_1^{\mu\nu}$ was introduced at the end of Section 18.1. Remembering
that $\bar{h}^{\mu\nu}_{\rm TT} = h^{\mu\nu}_{\rm TT}$, we thus have

$$h^{\mu\nu}_{\rm TT} = a\,e_1^{\mu\nu}\cos k_\mu x^\mu = a\,e_1^{\mu\nu}\cos k(x^0 - x^3),$$

where $k = \omega/c$, and using (18.12) we quickly find that

$$[\zeta^i] = (\xi^1, \xi^2, 0) - \tfrac12a\cos k(x^0 - x^3)\,(\xi^1, -\xi^2, 0).$$

Thus, for two particles initially separated in the $x^1$-direction ($\xi^1 \neq 0$) the physical
separation in the $x^1$-direction will oscillate, and likewise for two particles with an
initial $x^2$ separation. Let us consider a set of particles that, when $\cos k(x^0 - x^3) = 0$,
form a circle in the $(x^1, x^2)$-plane with a reference particle at the centre, with
respect to which we refer to the other particles, using the $\zeta$-vector components.
Then, as the wave passes, the particles remain coplanar and at other times have
spatial separations as illustrated in Figure 18.1.
Figure 18.1 The solid dots show the effect of a plane gravitational wave with
$A^{\mu\nu} = a\,e_1^{\mu\nu}$ on a transverse circle of particles. The initial configuration of
particles is shown by the open dots. From left to right, $k(x^0 - x^3)$ is equal to
$2n\pi$, $(2n + \tfrac12)\pi$, $(2n + 1)\pi$, $(2n + \tfrac32)\pi$ respectively.

Figure 18.2 The solid dots show the effect of a plane gravitational wave with
$A^{\mu\nu} = b\,e_2^{\mu\nu}$ on a transverse circle of particles. The initial configuration of
particles is shown by the open dots. From left to right, $k(x^0 - x^3)$ is equal to
$2n\pi$, $(2n + \tfrac12)\pi$, $(2n + 1)\pi$, $(2n + \tfrac32)\pi$ respectively.

We may straightforwardly repeat our analysis for a gravitational wave with the
other polarisation, i.e. $A^{\mu\nu} = b\,e_2^{\mu\nu}$, again with real and positive $b$. In this case one
finds that

$$[\zeta^i] = (\xi^1, \xi^2, 0) - \tfrac12b\cos k(x^0 - x^3)\,(\xi^2, \xi^1, 0),$$

and this results in our initial circle of particles having spatial separations as
illustrated in Figure 18.2, which may be obtained from Figure 18.1 by a 45°
rotation.

Having determined the relative displacements of test particles induced by the
two separate polarisations of a plane gravitational wave, it is straightforward to
find the effect in the general case in which $A^{\mu\nu} = a\,e_1^{\mu\nu} + b\,e_2^{\mu\nu}$, where $a$ and $b$
may, in general, be complex. Of particular interest are the left- and right-handed
circularly polarised modes, for which $b = -ia$ and $b = ia$ respectively. The effect
of, for example, a right-handed circularly polarised wave would be to distort our
initial circle of particles into an ellipse and to rotate the ellipse in a right-handed
sense, as illustrated in Figure 18.3. Note that the individual particles do not move
around the ring but instead execute small circular ‘epicycles’.

Figure 18.3 The solid dots show the effect of a plane gravitational wave with
$A^{\mu\nu} = a(e_1^{\mu\nu} + ie_2^{\mu\nu})$ (i.e. right-handed circular polarisation) on a transverse circle
of particles. The initial configuration of particles is shown by the open dots.
From left to right, $k(x^0 - x^3)$ is equal to $2n\pi$, $(2n + \tfrac12)\pi$, $(2n + 1)\pi$, $(2n + \tfrac32)\pi$
respectively.
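
The distortion of a ring of particles by a wave of general polarisation can be tabulated with a short script. The sketch below assumes a TT-gauge wave travelling in the $x^3$-direction with $h_{11} = -h_{22} = \Re[a\,e^{i\varphi}]$ and $h_{12} = h_{21} = \Re[b\,e^{i\varphi}]$, where $\varphi$ stands for the phase $k_\rho x^\rho$, and uses $\zeta^i = \xi^i - \tfrac12h_{ik}\xi^k$ for the physical separations. The function name, the number of particles and the greatly exaggerated amplitude are choices made only for illustration.

```python
import cmath
import math


def displaced_ring(a, b, phase, n_particles=8, radius=1.0):
    """Physical positions (zeta^1, zeta^2) of a ring of free particles."""
    h11 = (a * cmath.exp(1j * phase)).real   # 'plus' component, h22 = -h11
    h12 = (b * cmath.exp(1j * phase)).real   # 'cross' component
    ring = []
    for m in range(n_particles):
        angle = 2.0 * math.pi * m / n_particles
        x1, x2 = radius * math.cos(angle), radius * math.sin(angle)
        ring.append((
            x1 - 0.5 * (h11 * x1 + h12 * x2),
            x2 - 0.5 * (h12 * x1 - h11 * x2),
        ))
    return ring


# Single linear polarisation (b = 0) at phase 0, four particles on the axes.
plus_ring = displaced_ring(0.2, 0.0, 0.0, n_particles=4)
# Same wave a quarter of a cycle later: the ring is momentarily undistorted.
quarter_ring = displaced_ring(0.2, 0.0, math.pi / 2, n_particles=4)
# Circular polarisation with b = i*a.
circular_ring = displaced_ring(0.2, 0.2j, 0.0, n_particles=4)
print(plus_ring)
```

For $b = 0$ and real $a$ the ring is squeezed along $x^1$ while it is stretched along $x^2$, and vice versa half a cycle later, in agreement with the expression for $[\zeta^i]$ given above for a single polarisation.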


18.5 The generation of gravitational waves 

Let us suppose that we have a matter distribution (the source) localised near the
origin $O$ of our coordinate system and take our field point $\vec{x}$ to be a
distance $r$ from $O$ that is large compared with the spatial extent of the source. We
may therefore use the compact-source approximation discussed in Section 17.8.
Without loss of generality, we may take our spatial coordinates $x^i$ to correspond
to the ‘centre-of-momentum’ frame of the source particles, in which case from
(17.38) we have

$$\bar{h}^{00} = -\frac{4GM}{c^2r}, \qquad \bar{h}^{i0} = \bar{h}^{0i} = 0. \qquad (18.13)$$

The remaining (spatial) components of the gravitational field are given by
the integrated stress within the source, which may be written in terms of the
quadrupole formula (17.44) as

$$\bar{h}^{ij}(ct, \vec{x}) = -\frac{2G}{c^6r}\left[\frac{d^2I^{ij}(ct')}{dt'^2}\right]_r. \qquad (18.14)$$
In this expression $[\ ]_r$ denotes that the expression in the brackets is evaluated at
the retarded time $ct' = ct - r$, and the quadrupole-moment tensor of the source is

$$I^{ij}(ct) = \int T^{00}(ct, \vec{y})\,y^iy^j\,d^3\vec{y}. \qquad (18.15)$$

Thus, we see that, in the compact-source approximation, the far field of the source
falls into two parts: a steady field (18.13) from the total constant ‘mass’ $M$ of the
source and a possibly varying field (18.14) arising from the integrated internal
stresses of the source. It is clearly the latter that will be responsible for any emitted
gravitational radiation.

For slowly moving source particles we have $T^{00} \approx \rho c^2$, where $\rho$ is the proper
density of the source, and so the integral (18.15) may be written as

$$I^{ij}(ct) = c^2\int\rho(ct, \vec{x})\,x^ix^j\,d^3\vec{x}. \qquad (18.16)$$


Thus, the gravitational wave produced by an isolated non-relativistic source is 
proportional to the second derivative of the quadrupole moment of the matter- 
density distribution. By contrast, the leading contribution to electromagnetic
radiation is the first derivative of the dipole moment of the charge density distribution.
This fundamental difference between the two theories may be easily understood
from elementary considerations. Using $\rho$ to denote either the proper mass density
or the proper charge density, the volume integral $\int\rho\,dV$ over the source is constant
in time for both electromagnetism and linearised gravitation and so generates no
radiation. Now consider the next moment $\int\rho x^i\,dV$, i.e. the dipole moment. For
electromagnetism, this gives the position of the centre of charge of the source,
which can move with time and hence have a non-zero time derivative: this
provides the dominant contribution in the generation of electromagnetic radiation.
For gravitation, however, $\int\rho x^i\,dV$ gives the centre of mass of the source and,
for an isolated system, conservation of momentum means that it cannot change 
with time and so cannot contribute to the generation of gravitational waves. Thus, 
it is the generally much smaller quadrupole moment, which measures the shape 
of the source, that is dominant in generating gravitational waves. This fact, and 
the weak coupling of gravitation to matter, means that gravitational radiation 
is much weaker than electromagnetic radiation. As a corollary, we note that a 
spherically symmetric system has a zero quadrupole moment and thus cannot emit 
gravitational radiation. 

As an illustration of the generation of gravitational waves, let us consider two
particles A and B of equal mass M moving (non-relativistically) in circular orbits
of radius a about their common centre of mass with an angular speed $\Omega$ (see
Figure 18.4). This might represent a simple model of a binary star system, in
which mutual gravitational attraction keeps the particles (stars) in orbit. In this
case, treating the motion in the Newtonian limit, we require that

\[ \Omega = \left(\frac{GM}{4a^3}\right)^{1/2}. \tag{18.17} \]

Figure 18.4 Two particles A, B of equal mass M rotating at angular speed $\Omega$
in circular orbits of radius a about their common centre of mass.


Alternatively, in a more terrestrial setting, one might imagine the particles to
be connected by a light rod of length 2a that is spun with constant angular
velocity about its centre point, in which case $\Omega$ need not be related to M and a.
For simplicity, we shall assume the particle orbits to lie in the plane $x^3 = 0$, as
illustrated in Figure 18.4.

At any time t, the coordinates of particles A and B may be written

\[ [x_A^i] = (a\cos\Omega t,\ a\sin\Omega t,\ 0), \qquad [x_B^i] = -(a\cos\Omega t,\ a\sin\Omega t,\ 0). \]

Thus, the proper density of the source is given by

\[ \rho(ct, \mathbf{x}) = M\left[\delta(x^1 - a\cos\Omega t)\,\delta(x^2 - a\sin\Omega t) + \delta(x^1 + a\cos\Omega t)\,\delta(x^2 + a\sin\Omega t)\right]\delta(x^3). \]

On substituting into (18.16) and making use of the standard trigonometric identities
$2\cos^2\Omega t = 1 + \cos 2\Omega t$, $2\sin^2\Omega t = 1 - \cos 2\Omega t$ and $2\sin\Omega t\cos\Omega t = \sin 2\Omega t$,
one quickly finds the quadrupole-moment tensor,

\[ [I^{ij}(ct)] = Mc^2a^2 \begin{pmatrix} 1 + \cos 2\Omega t & \sin 2\Omega t & 0 \\ \sin 2\Omega t & 1 - \cos 2\Omega t & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{18.18} \]
Inserting this expression into the quadrupole formula (18.14) and performing the
necessary differentiations, we finally obtain

\[ [\bar h^{ij}(ct, \mathbf{x})] = \frac{8GMa^2\Omega^2}{c^4 r} \begin{pmatrix} \cos 2\Omega(t - r/c) & \sin 2\Omega(t - r/c) & 0 \\ \sin 2\Omega(t - r/c) & -\cos 2\Omega(t - r/c) & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]

We note that, for the physical arrangement illustrated in Figure 18.4, the coordinates
$x^i$ already correspond to the centre-of-momentum frame of the source and
so the remaining components of $\bar h^{\mu\nu}$ are given by (18.13).

In fact, one is often only interested in the radiative part $\bar h^{\mu\nu}_{\rm rad}$ of the gravitational
field (i.e. the part corresponding only to gravitational radiation). In general, the
remaining components of $\bar h^{\mu\nu}_{\rm rad}$ may be found from the spatial components $\bar h^{ij}_{\rm rad}$
using the Lorenz gauge condition. For the two-particle system discussed above,
we see from (18.13) that all the remaining components of $\bar h^{\mu\nu}_{\rm rad}$ are zero, and so

\[ [\bar h^{\mu\nu}_{\rm rad}] = \frac{8GMa^2\Omega^2}{c^4 r} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \cos 2\Omega(t - r/c) & \sin 2\Omega(t - r/c) & 0 \\ 0 & \sin 2\Omega(t - r/c) & -\cos 2\Omega(t - r/c) & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{18.19} \]


Since the amplitude goes as 1/r, the gravitational perturbation has the form of 
a spherical wave rather than a plane wave. Nevertheless, for large r the wave 
is well approximated by a plane wave in a small range of angles about any 
particular direction. We also note that the angular frequency of the wave is twice
the rotational angular frequency of the two particles. 
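
As a quick numerical cross-check of (18.18), the following Python sketch sums $c^2 M x^i x^j$ over the two point masses and compares the result with the closed-form matrix. It also confirms that the tensor repeats after half an orbit, which is why the wave has angular frequency $2\Omega$. The function names and the sample values of M, a and $\Omega$ are illustrative assumptions, not taken from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def quadrupole_from_particles(M, a, Omega, t):
    """I^{ij} = c^2 * sum over particles of M x^i x^j, i.e. (18.16) for point masses."""
    xA = (a * math.cos(Omega * t), a * math.sin(Omega * t), 0.0)
    xB = tuple(-x for x in xA)  # particle B sits diametrically opposite A
    return [[C**2 * M * (xA[i] * xA[j] + xB[i] * xB[j]) for j in range(3)]
            for i in range(3)]


def quadrupole_closed_form(M, a, Omega, t):
    """The matrix on the right-hand side of (18.18)."""
    c2, s2 = math.cos(2 * Omega * t), math.sin(2 * Omega * t)
    pref = M * C**2 * a**2
    return [[pref * (1 + c2), pref * s2, 0.0],
            [pref * s2, pref * (1 - c2), 0.0],
            [0.0, 0.0, 0.0]]


def max_rel_diff(A, B, scale):
    """Largest entry-wise difference between two 3x3 matrices, relative to scale."""
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3)) / scale


# Illustrative (assumed) values: 1 kg masses on a 1 m radius, spun at 10 rad/s.
M, a, Omega, t = 1.0, 1.0, 10.0, 0.123
scale = M * C**2 * a**2
direct = quadrupole_from_particles(M, a, Omega, t)
closed = quadrupole_closed_form(M, a, Omega, t)
half_turn = quadrupole_from_particles(M, a, Omega, t + math.pi / Omega)

print(max_rel_diff(direct, closed, scale))     # agreement with (18.18)
print(max_rel_diff(direct, half_turn, scale))  # tensor repeats after half an orbit
```

Both printed differences should be at the level of floating-point rounding.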

It is of interest to determine the polarisation of the gravitational waves received 
by observers located in different directions relative to the orbiting particles. To 
do this, one must transform to the TT gauge appropriate to each observer. Let us 
first consider an observer located on the $x^3$-axis (at some large distance from O).
By comparing with (18.7), we see that (18.19) is already in transverse-traceless
form for a wave travelling in the $x^3$-direction. Remembering that $\bar h^{\mu\nu}_{\rm TT} = h^{\mu\nu}_{\rm TT}$ and
using the fact that $r = x^3$, it is straightforward to show that

\[ (h^{\rm TT}_{\rm rad})^{\mu\nu} = \frac{8GMa^2\Omega^2}{c^4 r}\, \Re\left[(e_1^{\mu\nu} - i e_2^{\mu\nu}) \exp 2i\Omega(t - x^3/c)\right], \tag{18.20} \]

where $e_1^{\mu\nu}$ and $e_2^{\mu\nu}$ are the linear-polarisation tensors introduced at the end of
Section 18.1. Since the amplitude tensor has the form $A^{\mu\nu} = a e_1^{\mu\nu} + b e_2^{\mu\nu}$ with
$b = -ia$, this corresponds to right-handed circularly polarised radiation, as one
might expect.
Let us now consider an observer located on the $x^1$-axis. The form (18.19) is
not in the transverse-traceless gauge for a wave travelling in the $x^1$-direction. To
transform to the TT gauge, we follow the prescription outlined in Section 18.3.
We first set to zero all non-transverse components, i.e. all entries except those
with (i, j) = (2, 2), (2, 3), (3, 2) and (3, 3). We then subtract one-half of the
resulting trace from the remaining diagonal elements (2, 2) and (3, 3) to make the
final tensor traceless. Remembering that $r = x^1$ in this case, and that $\bar h^{\mu\nu}_{\rm TT} = h^{\mu\nu}_{\rm TT}$,
we obtain

\[ [(h^{\rm TT}_{\rm rad})^{\mu\nu}] = \frac{4GMa^2\Omega^2}{c^4 r} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -\cos 2\Omega(t - r/c) & 0 \\ 0 & 0 & 0 & \cos 2\Omega(t - r/c) \end{pmatrix} = \frac{4GMa^2\Omega^2}{c^4 r}\, \Re\left[-\tilde e_1^{\mu\nu} \exp 2i\Omega(t - x^1/c)\right], \tag{18.21} \]

where $\tilde e_1^{\mu\nu}$ is a linear polarisation tensor analogous to those used above, but for
propagation in the $x^1$-direction. Thus, the gravitational waves received by the
observer are linearly polarised in the ‘+’ orientation illustrated in Figure 18.1,
again as one might have expected.
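
The TT prescription used for the two observers can be sketched in a few lines of Python. The example below applies the projection $h_{\rm TT}^{ij} = (P^i{}_k P^j{}_l - \tfrac12 P^{ij} P_{kl}) h^{kl}$, with $P^{ij} = \delta^{ij} - n^i n^j$, to the spatial block of (18.19), working in units of $8GMa^2\Omega^2/(c^4 r)$ and treating the spatial indices as Euclidean. The helper names and the sample phase are assumptions made for illustration.

```python
import math


def tt_project(h, n):
    """Transverse-traceless part of a symmetric 3x3 matrix h for propagation
    along the unit vector n: (P h P) - 1/2 P tr(P h), with P = 1 - n n^T."""
    P = [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in range(3)]
         for i in range(3)]
    # Project both indices onto the plane orthogonal to n.
    PhP = [[sum(P[i][k] * h[k][l] * P[l][j] for k in range(3) for l in range(3))
            for j in range(3)] for i in range(3)]
    # Trace of the projected matrix, to be removed from the transverse block.
    trace = sum(P[k][l] * h[k][l] for k in range(3) for l in range(3))
    return [[PhP[i][j] - 0.5 * P[i][j] * trace for j in range(3)]
            for i in range(3)]


def h_spatial(phase):
    """Spatial block of (18.19) in units of 8 G M a^2 Omega^2 / (c^4 r);
    phase stands for 2 Omega (t - r/c)."""
    c, s = math.cos(phase), math.sin(phase)
    return [[c, s, 0.0], [s, -c, 0.0], [0.0, 0.0, 0.0]]


phase = 0.7  # arbitrary sample value of 2 Omega (t - r/c)
h = h_spatial(phase)
h_x3 = tt_project(h, (0.0, 0.0, 1.0))  # observer on the x^3-axis
h_x1 = tt_project(h, (1.0, 0.0, 0.0))  # observer on the x^1-axis

print(h_x3)  # unchanged: (18.19) is already TT for this direction
print(h_x1)  # diag(0, -cos/2, +cos/2): half the amplitude, '+' polarisation
```

For the $x^1$-axis observer the surviving entries are $\mp\tfrac12\cos 2\Omega(t - r/c)$ in these units, which reproduces the prefactor $4GMa^2\Omega^2/(c^4 r)$ of (18.21).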


18.6 Energy flow in gravitational waves 

Physically, one would expect gravitational waves to carry energy away from a 
radiating source. As discussed in Section 17.11, however, the task of assigning 
an energy density to a gravitational field is notoriously difficult. Nevertheless, 
bearing in mind the caveats made in Section 17.11, from (17.64) an appropriate 
expression for the energy-momentum tensor of the gravitational field in vacuo is 

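\[ t_{\mu\nu} = \frac{c^4}{32\pi G} \left\langle (\partial_\mu \bar h_{\rho\sigma})\,\partial_\nu \bar h^{\rho\sigma} - 2(\partial_\sigma \bar h^{\rho\sigma})\,\partial_{(\mu} \bar h_{\nu)\rho} - \tfrac12 (\partial_\mu \bar h)\,\partial_\nu \bar h \right\rangle, \]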

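where $\langle\cdots\rangle$ denotes an average over a small region at each point in spacetime.
If we adopt the TT gauge, however, the Lorenz gauge condition $\partial_\mu \bar h^{\mu\nu}_{\rm TT} = 0$ is
automatically satisfied, and also $h^{\rm TT} = 0$ and $\bar h^{\mu\nu}_{\rm TT} = h^{\mu\nu}_{\rm TT}$. Thus in this gauge the
energy-momentum tensor in vacuo reduces to

\[ t_{\mu\nu} = \frac{c^4}{32\pi G} \left\langle (\partial_\mu h^{\rm TT}_{\rho\sigma})\,\partial_\nu h_{\rm TT}^{\rho\sigma} \right\rangle. \]

We will assume further that we are considering only the radiative part of the
gravitational field, in which case we know from the discussion in Section 18.3
that $h^{0\mu}_{\rm TT} = 0$, and so

\[ t_{\mu\nu} = \frac{c^4}{32\pi G} \left\langle (\partial_\mu h^{\rm TT}_{ij})\,\partial_\nu h_{\rm TT}^{ij} \right\rangle. \tag{18.22} \]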


In particular, from our discussion of energy-momentum tensors in Section 8.1, at 
any given time and spatial position the energy flux (i.e. the energy crossing unit 
area per unit time) of the gravitational radiation in the unit spatial direction $n^i$ is

\[ F(\mathbf{n}) = -c\, t^{0k} n_k, \tag{18.23} \]

where the minus sign appears as a result of our choice of metric signature, since
then $F(\mathbf{n}) = -c\,\eta_{kj}\, t^{0k} n^j = c\,\delta_{kj}\, t^{0k} n^j$, as required.

As an illustration of these general results, let us calculate the energy flux in the 
direction of propagation for a plane gravitational wave of the form 

\[ h^{\rm TT}_{ij} = A^{\rm TT}_{ij} \cos k_\lambda x^\lambda, \]

where $A^{\rm TT}_{ij}$ are constants and, for convenience, we have chosen the arbitrary phase
of the wave in such a way that the amplitude matrix is real. Substituting this
expression into (18.22), and using the fact that $\langle \sin^2(k_\lambda x^\lambda) \rangle = \tfrac12$ when averaged
over several wavelengths, the energy-momentum tensor reads

\[ t_{\mu\nu} = \frac{c^4}{64\pi G}\, k_\mu k_\nu\, A^{ij}_{\rm TT} A^{\rm TT}_{ij}. \tag{18.24} \]

Thus, the flux F in the $\hat{\mathbf{k}}$-direction is given by

\[ F = -c\, t^{0k} \hat k_k = -\frac{c^5}{64\pi G}\, k^0 k^k \hat k_k\, A^{ij}_{\rm TT} A^{\rm TT}_{ij} = \frac{c^5}{64\pi G}\, k^0 k^0 A^{ij}_{\rm TT} A^{\rm TT}_{ij} = c\, t^{00}, \tag{18.25} \]

where in the third equality we have used the fact that $k^0 = |\mathbf{k}| = -k^i \hat k_i$, since the
wavevector is null. The final expression is simply the energy density associated
with the plane wave multiplied by its speed, and hence makes good physical sense 
as the energy flux carried by the wave in its direction of propagation. 

Specialising still further, we may calculate the forms of the expressions (18.24)
and (18.25) explicitly for a wave travelling in the $x^3$-direction, in which case
$[k_\mu] = (k, 0, 0, -k)$ where $k = \omega/c$ and $A^{\rm TT}_{ij}$ is given by (18.7). Thus, in this case,
the energy-momentum tensor (18.24) can be written as

\[ [t_{\mu\nu}] = \frac{c^2}{32\pi G}\, \omega^2 (a^2 + b^2) \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 \end{pmatrix} \]

and the flux in the direction of propagation is

\[ F = \frac{c^3}{32\pi G}\, \omega^2 (a^2 + b^2). \tag{18.26} \]

Clearly, similar results hold for a plane gravitational wave travelling along any
of the coordinate axes.

Using the result (18.26) and the expressions (18.20) and (18.21), we find
that, for the two-particle rotating system considered in the previous section, the
gravitational-wave energy flux at a (large) distance r in the $x^1$- and $x^3$-directions
respectively is

\[ F_1 = \frac{c^3}{32\pi G} (2\Omega)^2 \left( \frac{4GMa^2\Omega^2}{c^4 r} \right)^2 = \frac{2G}{\pi c^5} \left( \frac{Ma^2\Omega^3}{r} \right)^2, \]

\[ F_3 = 2\, \frac{c^3}{32\pi G} (2\Omega)^2 \left( \frac{8GMa^2\Omega^2}{c^4 r} \right)^2 = \frac{16G}{\pi c^5} \left( \frac{Ma^2\Omega^3}{r} \right)^2. \]

Thus, we see that the energy flux in the $x^3$-direction is eight times that in the
$x^1$-direction (or, by symmetry, in any direction in the $x^3 = 0$ plane). Hence the
energy flux due to the gravitational radiation emitted from this system is highly
anisotropic.
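
The factor of eight can be checked numerically. The sketch below evaluates the plane-wave flux $F = c^3\omega^2(a^2 + b^2)/(32\pi G)$ for the amplitudes read off from (18.20) and (18.21), taking $|b| = |a|$ for the circularly polarised wave, and compares with the closed forms $2G/(\pi c^5)\,(Ma^2\Omega^3/r)^2$ and $16G/(\pi c^5)\,(Ma^2\Omega^3/r)^2$. The function names and the sample binary parameters (mass, orbital radius, distance) are assumptions chosen only for illustration.

```python
import math

G = 6.674e-11        # gravitational constant, SI units
C = 299_792_458.0    # speed of light, m/s


def plane_wave_flux(omega, a, b):
    """Energy flux of a plane wave, as in (18.26)."""
    return C**3 * omega**2 * (a**2 + b**2) / (32 * math.pi * G)


def binary_fluxes(M, a_orbit, Omega, r):
    """Fluxes along x^1 and x^3 for the equal-mass rotating pair."""
    amp = 4 * G * M * a_orbit**2 * Omega**2 / (C**4 * r)
    F1 = plane_wave_flux(2 * Omega, amp, 0.0)          # linear, from (18.21)
    F3 = plane_wave_flux(2 * Omega, 2 * amp, 2 * amp)  # circular, from (18.20)
    return F1, F3


def flux_closed_form(M, a_orbit, Omega, r, factor):
    """factor * G / (pi c^5) * (M a^2 Omega^3 / r)^2, with factor 2 or 16."""
    return factor * G / (math.pi * C**5) * (M * a_orbit**2 * Omega**3 / r) ** 2


# Assumed sample system: two 1.4 solar-mass stars, orbital radius 1e9 m,
# observed from 1e19 m; Omega follows from the Newtonian relation (18.17).
M = 1.4 * 1.989e30
a_orbit = 1.0e9
r = 1.0e19
Omega = math.sqrt(G * M / (4 * a_orbit**3))

F1, F3 = binary_fluxes(M, a_orbit, Omega, r)
print(F1, F3, F3 / F1)
```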


18.7 Energy loss due to gravitational-wave emission 

Since gravitational waves carry away energy, we expect energy to be lost at a 
corresponding rate by the physical system generating the gravitational radiation. 
Let us suppose that the source matter distribution is localised near the origin O of 
our coordinates. To calculate the rate at which the physical system loses energy, 
we equate it to the energy flux of the emitted gravitational radiation evaluated 
over a sphere S of large radius r centered on O. Thus, if E is the energy of the 
physical system, we have 


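\[ \frac{dE}{dt} = -L_{\rm GW} = -r^2 \int_{4\pi} F(\mathbf{e}_r)\, d\Omega, \tag{18.27} \]

where $L_{\rm GW}$ is the total gravitational-wave luminosity, $F(\mathbf{e}_r)$ is the gravitational-wave
energy flux at a radius r in the (unit) radial direction $\mathbf{e}_r$ and $d\Omega$ is an
element of solid angle.

In general, using (18.22) and (18.23) we may write the gravitational-wave flux
in a unit spatial direction $\mathbf{n}$ as

\[ F(\mathbf{n}) = -\frac{c^5}{32\pi G} \left\langle (\partial^0 h^{\rm TT}_{ij}) (\partial^k h_{\rm TT}^{ij}) \right\rangle n_k = -\frac{c^4}{32\pi G} \left\langle (\partial_t h^{\rm TT}_{ij}) (\mathbf{n} \cdot \nabla h_{\rm TT}^{ij}) \right\rangle, \]

where we have made the identification $x^0 = ct$ and where $\partial_t = \partial/\partial t$. In the second
equality the operator $\mathbf{n} \cdot \nabla$ returns simply the rate of change of its argument in the
direction $\mathbf{n}$. Thus, taking $\mathbf{n}$ to lie in the radial direction and writing $\partial_r = \partial/\partial r$, we
have

\[ F(\mathbf{e}_r) = -\frac{c^4}{32\pi G} \left\langle (\partial_t h^{\rm TT}_{ij}) (\partial_r h_{\rm TT}^{ij}) \right\rangle. \tag{18.28} \]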


To obtain a general formula for (18.27), we must calculate the above energy flux 
in terms of properties of the source distribution. From the quadrupole formula 
(18.14) we have

\[ \bar h^{ij} = -\frac{2G}{c^6 r} \left[ \ddot I^{ij} \right]_r, \]

where $I^{ij}$ is the quadrupole-moment tensor of the source distribution defined in
(18.16), the dots denote d/dt and $[\;]_r$ denotes that the expression should be
evaluated at the retarded time $ct_r = ct - r$. It will, in fact, be more convenient to
work in terms of the reduced quadrupole-moment tensor of the source distribution,
which is defined by

\[ J_{ij} = I_{ij} - \tfrac13 \delta_{ij} I, \tag{18.29} \]

where $I = I_{ii}$ is the trace of the original tensor. One immediately sees that $J_{ij}$
is simply the traceless version of $I_{ij}$. As a result, we may write the transverse-traceless
part of the gravitational field tensor as

\[ \bar h^{ij}_{\rm TT} = h^{ij}_{\rm TT} = -\frac{2G}{c^6 r} \left[ \ddot J^{ij}_{\rm TT} \right]_r, \tag{18.30} \]

where $J^{ij}_{\rm TT}$ is the transverse-traceless part of (18.29). Since at any point on the
sphere S the direction of gravitational-wave propagation is radial, from (18.10)
we have

\[ J^{\rm TT}_{ij} = \left( P_i{}^k P_j{}^l - \tfrac12 P_{ij} P^{kl} \right) J_{kl}, \tag{18.31} \]

where $P^{ij} = \delta^{ij} - e_r^i e_r^j$ is the spatial projection tensor, which projects tensor
components onto the spatial surface orthogonal to the radial direction at any point.
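Using (18.30) and the expressions (17.32, 17.33) for the derivatives of time-retarded
quantities, the derivatives in the expression (18.28) for the energy flux
are given by

\[ \partial_t h^{ij}_{\rm TT} = -\frac{2G}{c^6 r} \left[ \dddot J^{ij}_{\rm TT} \right]_r, \qquad \partial_r h^{ij}_{\rm TT} = \frac{2G}{c^7 r} \left[ \dddot J^{ij}_{\rm TT} \right]_r, \]

where, in the second equation, we have retained only the term in 1/r, which
dominates for large r. Substituting these expressions into (18.28) we obtain

\[ F(\mathbf{e}_r) = \frac{G}{8\pi r^2 c^9} \left\langle \left[ \dddot J^{\rm TT}_{ij} \dddot J_{\rm TT}^{ij} \right]_r \right\rangle. \]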


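For convenience, we now use (18.31) to rewrite the product of transverse-traceless
quadrupole moments in terms of products of reduced moments. Denoting the
components $e_r^i$ of the unit radial vector by $\hat x^i$, this yields

\[ \dddot J^{\rm TT}_{ij} \dddot J_{\rm TT}^{ij} = \dddot J_{ij} \dddot J^{ij} - 2 \dddot J_i{}^{j} \dddot J^{ik} \hat x_j \hat x_k + \tfrac12 \dddot J^{ij} \dddot J^{kl} \hat x_i \hat x_j \hat x_k \hat x_l, \]

where we have made use of the fact that $J_{ij}$ is traceless. Thus, the total
gravitational-wave luminosity is given by

\[ L_{\rm GW} = \frac{G}{8\pi c^9} \int_{4\pi} \left\langle \left[ \dddot J_{ij} \dddot J^{ij} - 2 \dddot J_i{}^{j} \dddot J^{ik} \hat x_j \hat x_k + \tfrac12 \dddot J^{ij} \dddot J^{kl} \hat x_i \hat x_j \hat x_k \hat x_l \right]_r \right\rangle d\Omega. \]

Since the reduced quadrupole moment $J_{ij}$ is defined as an integral over all space,
it does not depend on the angular coordinates and so may be taken outside the
integral. The three remaining integrals are easily evaluated to give

\[ \int_{4\pi} d\Omega = 4\pi, \qquad \int_{4\pi} \hat x_i \hat x_j \, d\Omega = \frac{4\pi}{3} \delta_{ij}, \]

\[ \int_{4\pi} \hat x_i \hat x_j \hat x_k \hat x_l \, d\Omega = \frac{4\pi}{15} \left( \delta_{ij} \delta_{kl} + \delta_{ik} \delta_{jl} + \delta_{il} \delta_{jk} \right). \]

The first result is trivial. The second result may be obtained by noting that
integration over all angles yields zero for $i \neq j$, whereas on raising one index and
setting i = j the integrand becomes $\hat x_i \hat x^i = 1$ and so the integral equals $4\pi$. Similar
reasoning leads to the third result. Substituting these three results into (18.27) and
simplifying, one finally obtains

\[ L_{\rm GW} = \frac{G}{5 c^9} \left\langle \left[ \dddot J_{ij} \dddot J^{ij} \right]_r \right\rangle. \tag{18.32} \]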
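
For readers who wish to see the simplification spelled out, the following worked step is a sketch of how the three angular integrals combine into the factor 1/5. It treats the spatial indices as Euclidean and uses the tracelessness of $J_{ij}$, so that the $\delta_{ij}\delta_{kl}$ term drops out while the other two terms of the fourth-rank integral each contribute $\dddot J_{ij}\dddot J^{ij}$.

```latex
\begin{align*}
L_{\rm GW}
  &= \frac{G}{8\pi c^9}\Big\langle\Big[
       4\pi\,\dddot J_{ij}\dddot J^{ij}
       - 2\cdot\frac{4\pi}{3}\,\dddot J_i{}^{j}\dddot J^{ik}\,\delta_{jk}
       + \frac12\cdot\frac{4\pi}{15}\,\dddot J^{ij}\dddot J^{kl}
         \left(\delta_{ij}\delta_{kl}+\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk}\right)
     \Big]_r\Big\rangle \\
  &= \frac{G}{8\pi c^9}\cdot 4\pi\left(1 - \frac23 + \frac1{15}\right)
     \Big\langle\big[\dddot J_{ij}\dddot J^{ij}\big]_r\Big\rangle
   = \frac{G}{2c^9}\cdot\frac25\,
     \Big\langle\big[\dddot J_{ij}\dddot J^{ij}\big]_r\Big\rangle
   = \frac{G}{5c^9}\Big\langle\big[\dddot J_{ij}\dddot J^{ij}\big]_r\Big\rangle .
\end{align*}
```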
As an illustration, let us apply the general formula (18.32) to the specific example
of the two-particle rotating system discussed in Section 18.5. The quadrupole-moment
tensor $I^{ij}$ for this system is given in (18.18), from which we quickly find
that the reduced quadrupole-moment tensor (18.29) is given by

\[ [J_{ij}] = Mc^2a^2 \begin{pmatrix} \tfrac13 + \cos 2\Omega t & \sin 2\Omega t & 0 \\ \sin 2\Omega t & \tfrac13 - \cos 2\Omega t & 0 \\ 0 & 0 & -\tfrac23 \end{pmatrix}. \]

The corresponding third time derivative reads

\[ [\dddot J_{ij}] = 8Mc^2a^2\Omega^3 \begin{pmatrix} \sin 2\Omega t & -\cos 2\Omega t & 0 \\ -\cos 2\Omega t & -\sin 2\Omega t & 0 \\ 0 & 0 & 0 \end{pmatrix}, \]

and so (18.32) becomes

\[ \frac{dE}{dt} = -L_{\rm GW} = -\frac{G}{5c^9} (8Mc^2a^2\Omega^3)^2 \left[ 2\sin^2 2\Omega(t - r/c) + 2\cos^2 2\Omega(t - r/c) \right] = -\frac{G}{5c^5} (128 M^2 a^4 \Omega^6). \tag{18.33} \]

18.8 Spin-up of binary systems: the binary pulsar PSR B1913+16


As discussed in Section 18.5, our simple two-particle rotating system can be used
to model an equal-mass astrophysical binary system, in which case $\Omega$ is given by
(18.17). Inserting this expression into (18.33), we find that the total energy E of
the binary system obeys

\[ \frac{dE}{dt} = -\frac{2G^4M^5}{5c^5a^5}. \tag{18.34} \]

Treating the binary in the Newtonian limit, the total energy is simply

\[ E = \tfrac12 (2Mv^2) - \frac{GM^2}{2a}, \]

where v is the orbital speed of either object. Using the radial equation of motion
$Mv^2/a = GM^2/(2a)^2$, we may write

\[ E = -\frac{GM^2}{4a} = -Mv^2, \]

from which we see that the total energy is negative, since the binary system is
gravitationally bound. Moreover, we note that as E decreases (i.e. becomes more
negative), according to (18.34) the radius a of the orbit must decrease whereas
the orbital speed v must increase. Thus, the emission of gravitational radiation
causes the binary system to ‘spin-up’, ending ultimately in the coalescence of the
two objects.

For comparison with observations of binary systems, the most useful way of
characterising the spin-up is by the rate of change of the orbital period P. For our
simple system $P = 2\pi a/v$, and so we may write the total energy as

\[ E = -\left( \frac{\pi^2 G^2 M^5}{4} \right)^{1/3} P^{-2/3}. \tag{18.35} \]


Differentiating this expression with respect to t and inverting, we find that the 
rate of change of the orbital period is related to the rate of change of energy by 


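\[ \frac{dP}{dt} = -\frac{3P}{2E} \frac{dE}{dt}. \tag{18.36} \]

Substituting $a = -GM^2/4E$ into (18.34) and then substituting for E using (18.35),
we find that (18.36) can be written as follows:

\[ \frac{dP}{dt} = -\frac{96}{5}\, 4^{1/3} \pi \left( \frac{2\pi GM}{c^3 P} \right)^{5/3}. \]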

This expression gives the rate of change of the orbital period solely in terms 
of some constants and P itself, which can be determined straightforwardly from 
observations. 
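
To make the final formula concrete, the Python sketch below evaluates $dP/dt = -\tfrac{96}{5}\,4^{1/3}\pi\,(2\pi GM/c^3P)^{5/3}$ and cross-checks it against the chain of steps that leads to it: the Newtonian relation $P^2 = 16\pi^2a^3/(GM)$, the energy $E = -GM^2/4a$, the loss rate (18.34) and the relation (18.36). The function names are illustrative, and the sample inputs (two 1.4 solar-mass stars on a circular 7.75-hour orbit) are assumed round numbers rather than fitted values.

```python
import math

G = 6.674e-11        # gravitational constant, SI units
C = 299_792_458.0    # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg


def period_derivative(M, P):
    """dP/dt for an equal-mass circular binary (closed-form expression)."""
    x = 2 * math.pi * G * M / (C**3 * P)
    return -(96 / 5) * 4 ** (1 / 3) * math.pi * x ** (5 / 3)


def period_derivative_via_energy(M, P):
    """Same quantity rebuilt step by step from (18.34), (18.35) and (18.36)."""
    a = (G * M * P**2 / (16 * math.pi**2)) ** (1 / 3)   # from P = 2 pi a / v
    E = -G * M**2 / (4 * a)                              # total Newtonian energy
    dE_dt = -(2 / 5) * G**4 * M**5 / (C**5 * a**5)       # energy loss (18.34)
    return -(3 * P / (2 * E)) * dE_dt                    # relation (18.36)


M = 1.4 * M_SUN      # assumed mass of each star
P = 7.75 * 3600.0    # assumed orbital period in seconds

direct = period_derivative(M, P)
stepwise = period_derivative_via_energy(M, P)
print(direct, stepwise)  # dimensionless (seconds per second), both negative
```

For these inputs the result is of order $10^{-13}$ in magnitude. It applies to a circular orbit only and does not include the eccentricity correction mentioned in the text for PSR B1913+16.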

The spin-up of a binary system resulting from the emission of gravitational 
waves has already been observed in the binary pulsar PSR B1913+16. This
system was discovered in 1974 by Hulse and Taylor and consists of a pulsar and
an unseen companion, each with a mass of about 1.4 $M_\odot$; the orbital period is
7.75 hours. The pulsar provides a very accurate clock, so that the change in the 
orbital period as the system loses energy can be measured. In practice, our results 
above have to be modified slightly to allow for the considerable eccentricity of 
the orbit (e = 0.617), but this is relatively straightforward. Timing measurements 
made by Taylor and colleagues over several decades show that the decrease in 
orbital period as a function of time is in agreement with that predicted from 
the emission of gravitational radiation, to within one-third of one per cent. This 
constitutes an additional, and highly accurate, experimental verification of general 
relativity (albeit in the weak-field regime), for which Hulse and Taylor received 
the Nobel Prize in Physics in 1993. 


18.9 The detection of gravitational waves 

Although the measurement of the spin-up of the binary pulsar PSR B1913+16
provides indirect evidence of the existence of gravitational radiation, a major goal
of modern experimental astrophysics is to make a direct detection of gravitational
waves by measuring their influence on some test bodies.

There are two distinct approaches to gravitational-wave detection, ‘free-particle’
and ‘resonant’ detection. In our discussion in Section 18.4, we found that the
effect of a gravitational wave on a cloud of free test particles is a variation in
their relative separations. Thus one may attempt to detect gravitational waves by
measuring the separations of a set of free test particles as a function of time,
which is the basis of free-particle detection experiments. Alternatively, if the
particles are not free, but are instead the constituent particles of some elastic
body, then tidal forces on the particles induced by a gravitational wave will give
rise to vibrations in the body, which one can attempt to measure. In particular, if
the incident gravitational radiation were in the form of a plane wave of a given
frequency then the amplitude of the induced vibrations would be enhanced if
the elastic body were designed to have a resonant frequency close to that of the
incident wave. This is the basis of resonant detection.

Resonant detectors are the older type of realistic gravitational-wave detector,
having been pioneered by Weber in the early 1960s and refined by him and
others over several decades. We will concentrate our discussion, however, on
free-particle gravitational-wave detectors, which have gained in popularity over
recent years and are also very much easier to analyse. In our discussion of the
motion of free test particles in the presence of a passing gravitational wave, we
showed in Section 18.4 that the relative physical separation l of two free particles
varies as

\[ l^2 = (\delta_{ij} - h_{ij})\, \zeta^i \zeta^j, \]

where $\zeta^i$ is the separation vector between the two particles. In the absence of a
gravitational wave, the undisturbed distance $l_0$ between the particles is given by
$l_0^2 = \delta_{ij} \zeta^i \zeta^j$. To first order in $h_{ij}$, the fractional change in the physical separation
of the particles is therefore given by

\[ \frac{\delta l}{l_0} = -\tfrac12 h_{ij}\, n^i n^j, \]

where $n^i$ is a unit vector in the direction of separation of the two particles. Thus,
we see that the passing of a gravitational wave produces a linear strain, i.e. the
change in the relative separation of the particles is proportional to their original
undisturbed separation. For typical astrophysical sources, the largest strain one
might reasonably expect to receive at the Earth is of order

\[ \frac{\delta l}{l} \sim 10^{-21}. \]

Thus, even if the two test masses were separated by a distance $l_0 = 1$ km, the
change $\delta l$ in this distance is of order $10^{-16}$ cm, which corresponds to $\sim 10^{-8}$ of
the size of the atoms that comprise the test masses!
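
The arithmetic behind these orders of magnitude can be sketched as follows. The example applies the linear-strain relation $\delta l / l_0 = -\tfrac12 h_{ij} n^i n^j$ to a '+' polarised wave of amplitude $10^{-21}$ falling on two perpendicular 1 km arms. The helper name, the choice of polarisation and the atomic size of $10^{-8}$ cm are assumptions made for illustration.

```python
import math


def fractional_change(h, n):
    """delta l / l0 = -1/2 h_ij n^i n^j for a 3x3 strain matrix h and unit vector n."""
    return -0.5 * sum(h[i][j] * n[i] * n[j] for i in range(3) for j in range(3))


H_PLUS = 1e-21    # assumed wave amplitude (dimensionless strain)
L0_CM = 1.0e5     # undisturbed arm length: 1 km expressed in cm
ATOM_CM = 1.0e-8  # assumed typical atomic size in cm

# '+' polarised wave travelling along x^3: stretches x^1 while squeezing x^2.
h = [[H_PLUS, 0.0, 0.0],
     [0.0, -H_PLUS, 0.0],
     [0.0, 0.0, 0.0]]

dl_x = fractional_change(h, (1.0, 0.0, 0.0)) * L0_CM  # change of the x^1 arm
dl_y = fractional_change(h, (0.0, 1.0, 0.0)) * L0_CM  # change of the x^2 arm
arm_difference = abs(dl_x - dl_y)                     # what an interferometer senses

print(arm_difference)            # length change in cm
print(arm_difference / ATOM_CM)  # as a fraction of an atomic diameter
```

The two arms change by equal and opposite amounts, so the difference in arm length is the full strain amplitude times the arm length.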

Fortunately, laser Michelson interferometers provide a means of measuring
such tiny changes in the separation of the test masses. The principle of operation
of such an experiment is quite straightforward and is illustrated in Figure 18.5.
The basic system is made up of three test masses. Two have mirrors M attached to
them, and to the third is attached a beamsplitter B. Each mass is suspended from
a support that isolates the mass from external vibrations but allows it to swing
freely in the horizontal direction. A laser L (with typical wavelength $\lambda \sim 10^{-4}$ cm)
is aimed at B, which splits the laser light into two beams directed down the arms
of the interferometer. The beams are reflected by the mirrors at the end of each
arm and then recombined in B before being detected in the detector D. When
the beams are recombined they will interfere constructively if the lengths of the
two arms $L_1$ and $L_2$ differ by an amount $\Delta L = n\lambda$ and will interfere destructively
if $\Delta L = (n + \tfrac12)\lambda$, where n is an integer. The system is arranged so that
the beams interfere destructively if all three masses are perfectly stationary. In
practice, the experimental set-up is more sophisticated than the simple Michelson
interferometer we have discussed. The most important improvement is the introduction
of an additional test mass with a partially reflecting mirror P in each arm
of the interferometer, thereby forming a ‘cavity’, as illustrated in Figure 18.5.
A typical photon may travel up and down this cavity many times before eventually
arriving at the beamsplitter, thereby greatly increasing the effective arm length of
the interferometer. The use of large laser Michelson interferometers as a means
for attempting to detect gravitational waves is currently being actively pursued
by a number of laboratories around the world.

Figure 18.5 A schematic representation of a laser Michelson interferometer
designed to detect gravitational waves (see the main text for details).


Exercises 

18.1 For a plane gravitational wave of the form $\bar h^{\mu\nu} = A^{\mu\nu} \exp(ik_\rho x^\rho)$, show that, under
the gauge transformation (17.5) with $\xi^\mu = \epsilon^\mu \exp(ik_\rho x^\rho)$, the amplitude tensor
transforms as

\[ A'^{\mu\nu} = A^{\mu\nu} - i\epsilon^\mu k^\nu - i\epsilon^\nu k^\mu + i\eta^{\mu\nu} \epsilon^\rho k_\rho. \]

18.2 The trace-reverse gravitational-field tensor transforms as

\[ \bar h'^{\mu\nu} = \bar h^{\mu\nu} - \partial^\mu \xi^\nu - \partial^\nu \xi^\mu + \eta^{\mu\nu} \partial_\sigma \xi^\sigma. \]

Since the components $\xi^\mu$ also satisfy the in vacuo wave equation $\Box^2 \xi^\mu = 0$, show
that this gauge transformation may be used to set any four linear combinations of
the $\bar h'^{\mu\nu}$ to zero.

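18.3 The transverse-traceless (TT) gauge is defined by choosing

\[ h^{0i}_{\rm TT} = 0 \quad \text{and} \quad h^{\rm TT} = 0. \]

Hence show that

\[ \partial_0 h^{00}_{\rm TT} = 0 \quad \text{and} \quad \partial_i h^{ij}_{\rm TT} = 0. \]

18.4 For a plane gravitational wave of the form $h^{\mu\nu} = A^{\mu\nu} \exp(ik_\rho x^\rho)$, show that the four
conditions in Exercise 18.3 become

\[ A^{0i}_{\rm TT} = 0, \quad (A_{\rm TT})^\mu{}_\mu = 0, \quad A^{00}_{\rm TT} = 0, \quad A^{ij}_{\rm TT} k_j = 0. \]

18.5 Show that the spatial projection tensor $P_{ij} = \delta_{ij} - n_i n_j$, where $n^i$ is a unit vector,
satisfies the relations

\[ n_i P^i{}_j v^j = 0 \quad \text{and} \quad P^i{}_k P^k{}_j v^j = P^i{}_j v^j, \]

and interpret these relations geometrically.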

18.6 The quantities $A^{ij}$ are the spatial components of the amplitude tensor for a plane
gravitational wave with spatial wavevector $k^i$. Consider the tensor

\[ A^{ij}_{\rm TT} = \left( P^i{}_k P^j{}_l - \tfrac12 P^{ij} P_{kl} \right) A^{kl}, \]

where $P_{ij} = \delta_{ij} - \hat k_i \hat k_j$. Show that $A^{ij}_{\rm TT}$ is both transverse, so that $A^{ij}_{\rm TT} k_j = 0$, and
traceless.

18.7 Use your answer to Exercise 18.6 to show that, for a plane gravitational wave
propagating in the $x^1$-direction,

\[ [A^{\mu\nu}_{\rm TT}] = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \tfrac12 (A^{22} - A^{33}) & A^{23} \\ 0 & 0 & A^{23} & \tfrac12 (A^{33} - A^{22}) \end{pmatrix}. \]

18.8 In the TT gauge show that, to first order in $h_{\mu\nu}$,

\[ \Gamma^\mu{}_{00} = 0 \quad \text{and} \quad \Gamma^\mu{}_{0\nu} = \tfrac12 \partial_0 (h^{\rm TT})^\mu{}_\nu. \]

18.9 Consider two nearby particles, initially at rest in our chosen coordinate system $x^\mu$,
which have a coordinate separation given by a small spacelike connecting vector
$[\zeta^\mu] = (0, \zeta^1, \zeta^2, \zeta^3)$. During the passage of a gravitational wave show that, to
first order in $h_{\mu\nu}$ in the TT gauge, the equation of geodesic deviation may be
written as

\[ \frac{D^2 \zeta^\mu}{D\tau^2} = \tfrac12 c^2 (\partial_0 \partial_0 h^\mu{}_\sigma)\, \zeta^\sigma. \]

Show further that, to the same order of approximation, in the TT gauge one has

\[ \frac{D^2 \zeta^\mu}{D\tau^2} = \frac{d^2 \zeta^\mu}{d\tau^2} + \tfrac12 c^2 (\partial_0 \partial_0 h^\mu{}_\sigma)\, \zeta^\sigma. \]

Hence show that $\zeta^\mu$ = constant is a solution of the geodesic equation, and so the
coordinate separation of the two particles remains unaltered during the passage of
the gravitational wave.

18.10 If $\zeta^i$ is the spatial coordinate separation vector of two nearby particles, show that
the square of their physical separation is given by

\[ l^2 = \delta_{ij}\, \xi^i \xi^j, \]

where $\xi^i = \zeta^i + \tfrac12 h^i{}_k \zeta^k$. Show that, during the passage of a gravitational wave with
$A^{\mu\nu} = b e_2^{\mu\nu}$ that is travelling in the $x^3$-direction,

\[ (\xi^1, \xi^2) = (\zeta^1, \zeta^2) - \tfrac12 b \cos k(x^0 - x^3)\, (\zeta^2, \zeta^1). \]


18.11 For two test particles reacting to the passage of a circularly polarised gravitational 
wave, show that one particle moves in a circle with respect to the other. 

18.12 For the two-particle system considered in Section 18.5, verify that

\[ [\bar h^{\mu\nu}_{\rm rad}(ct, \mathbf{x})] = \frac{8GMa^2\Omega^2}{c^4 r} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \cos 2\Omega(t - r/c) & \sin 2\Omega(t - r/c) & 0 \\ 0 & \sin 2\Omega(t - r/c) & -\cos 2\Omega(t - r/c) & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

and hence show that an observer on the $x^3$-axis measures a right-handed circularly
polarised gravitational wave of the form

\[ (h^{\rm TT}_{\rm rad})^{\mu\nu} = \frac{8GMa^2\Omega^2}{c^4 r}\, \Re\left[ (e_1^{\mu\nu} - i e_2^{\mu\nu}) \exp 2i\Omega(t - x^3/c) \right]. \]

18.13 Consider a system of four equal masses attached to the ends of a cross formed 
from massless rods of equal length, set at 90°. If the system rotates freely about 
an axis through the centre of the cross and perpendicular to its plane, show that 
in the far field there is no quadrupole gravitational radiation. 

Hint: Consider the system as the superposition of two systems, each like that in 
Exercise 18.12 but 90° out of phase. 

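18.14 For a plane gravitational wave of the form

\[ h^{\rm TT}_{ij} = A^{\rm TT}_{ij} \cos k_\lambda x^\lambda \]

travelling in the $x^3$-direction, verify that the energy-momentum tensor of the
linearised gravitational field is given by

\[ [t_{\mu\nu}] = \frac{c^2}{32\pi G}\, \omega^2 (a^2 + b^2) \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 \end{pmatrix} \]

and that the flux in the direction of propagation is

\[ F = \frac{c^3}{32\pi G}\, \omega^2 (a^2 + b^2). \]

18.15 For the two-particle system considered in Section 18.5, verify that the gravitational-wave
energy flux at a (large) distance r is, in the $x^1$- and $x^3$-directions respectively,

\[ F_1 = \frac{c^3}{32\pi G} (2\Omega)^2 \left( \frac{4GMa^2\Omega^2}{c^4 r} \right)^2 = \frac{2G}{\pi c^5} \left( \frac{Ma^2\Omega^3}{r} \right)^2, \]

\[ F_3 = 2\, \frac{c^3}{32\pi G} (2\Omega)^2 \left( \frac{8GMa^2\Omega^2}{c^4 r} \right)^2 = \frac{16G}{\pi c^5} \left( \frac{Ma^2\Omega^3}{r} \right)^2. \]

18.16 If $J^{ij}_{\rm TT} = \left( P^i{}_k P^j{}_l - \tfrac12 P^{ij} P_{kl} \right) J^{kl}$ and $P^{ij} = \delta^{ij} - \hat x^i \hat x^j$, show that

\[ J^{\rm TT}_{ij} J_{\rm TT}^{ij} = J_{ij} J^{ij} - 2 J_i{}^{j} J^{ik} \hat x_j \hat x_k + \tfrac12 J^{ij} J^{kl} \hat x_i \hat x_j \hat x_k \hat x_l. \]

18.17 If $\hat x^i$ is a unit radial vector, show that

\[ \int_{4\pi} \hat x_i \hat x_j \, d\Omega = \frac{4\pi}{3} \delta_{ij}, \qquad \int_{4\pi} \hat x_i \hat x_j \hat x_k \hat x_l \, d\Omega = \frac{4\pi}{15} \left( \delta_{ij} \delta_{kl} + \delta_{ik} \delta_{jl} + \delta_{il} \delta_{jk} \right). \]

18.18 For the two-particle system considered in Section 18.5, verify that gravitational-wave
emission causes the total energy E of the system to decrease according to

\[ \frac{dE}{dt} = -\frac{G}{5c^5} (128 M^2 a^4 \Omega^6). \]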




18.19 For a binary star system containing two stars of mass M and separation 2a, show
that the orbital angular speed is

\[ \Omega = \left( \frac{GM}{4a^3} \right)^{1/2}. \]

Hence show that gravitational-wave emission causes the total energy E of the
system to decrease according to

\[ \frac{dE}{dt} = -\frac{2G^4M^5}{5c^5a^5}. \]

Thus show that the orbital period P decreases according to

\[ \frac{dP}{dt} = -\frac{96}{5}\, 4^{1/3} \pi \left( \frac{2\pi GM}{c^3 P} \right)^{5/3}. \]

18.20 Show that, to first order in $h_{ij}$, the fractional change in the physical separation of
the particles during the passage of a gravitational wave is

\[ \frac{\delta l}{l_0} = -\tfrac12 h_{ij}\, n^i n^j, \]

where $n^i$ is a unit vector in the direction of separation of the two particles.

18.21 Consider a line element of the form

\[ ds^2 = c^2 dt^2 - dx^2 - f^2(u)\, dy^2 - g^2(u)\, dz^2, \]

where f(u) and g(u) are functions of u = ct − x. Calculate the connection coefficients
and hence the Ricci tensor for this line element. Hence show that the line
element is a solution to the full empty-space field equations $R_{\mu\nu} = 0$, provided that

\[ \frac{f''}{f} + \frac{g''}{g} = 0, \]

where a prime denotes d/du. Show that this solution may be interpreted, with no
approximation, as a linearly polarised plane gravitational wave travelling in the
x-direction.



19 


A variational approach to general relativity 


Most of classical and quantum physics can be expressed in terms of variational 
principles, and it is often when written in this form that the physical meaning 
is most clearly understood. Moreover, once a physical theory has been writ¬ 
ten as a variational principle it is usually straightforward to identify conserved 
quantities, or symmetries of the system of interest, that otherwise might have 
been found only with considerable effort. Conversely, by demanding that the 
variational principle be invariant under some symmetry, one ensures that the 
equations of motion derived from it also respect that symmetry. In this final 
chapter, we therefore present an introductory account of variational principles 
and the Lagrangian formalism. Our ultimate aim will be to derive afresh the field 
equations of general relativity from this new perspective. This will require us to 
consider some general aspects of classical field theory in flat and curved space- 
times. As a result, this chapter lies somewhat outside the mainstream discussion 
presented in preceding chapters and may be omitted on a first reading. Never¬ 
theless the variational approach that we shall outline is extremely powerful and 
provides the basis for most current research into the formulation of classical (and 
quantum) field theories, including general relativity and other candidate theories 
of gravitation. 


19.1 Hamilton’s principle in Newtonian mechanics 

To begin, let us remind ourselves of a familiar example of a physical
variational principle, namely Hamilton’s principle in Newtonian mechanics.
Consider a mechanical system whose configuration can be defined uniquely by a
number of generalised coordinates $q^a$, $a = 1, 2, \ldots, n$ (usually
distances and angles), together with time $t$, and which experiences only
forces derivable from a potential. Hamilton’s principle states that in moving
from one configuration at time $t_1$ to another at time $t_2$ the motion of
such a system is such as to make stationary the action

$$ S = \int_{t_1}^{t_2} L(q^a, \dot{q}^a, t)\, dt. \tag{19.1} $$

The Lagrangian $L$ is defined, in terms of the kinetic energy $T$ and the
potential energy $V$ (with respect to some reference situation), by
$L = T - V$. Here $V$ is a function of the $q^a$ (and possibly $t$) only, but
not of the $\dot{q}^a$. As discussed in Section 3.19, the coordinates define a
configuration space with line element $ds^2 = g_{ab}\, dq^a\, dq^b$. For
example, the Lagrangian for a particle of mass $m$ can be written as

$$ L = T - V = \tfrac{1}{2} m g_{ab}\, \dot{q}^a \dot{q}^b - V. \tag{19.2} $$

Returning to the general expression (19.1), let us consider an arbitrary
variation

$$ q^a(t) \to q'^a(t) = q^a(t) + \delta q^a(t) $$

in the trajectory in configuration space and demand that the corresponding
variation $\delta S$ in the action vanishes. Assuming that $\delta q^a(t) = 0$
at the endpoints $t_1$ and $t_2$, we know from our discussion of the calculus
of variations in Appendix 3C at the end of Chapter 3 that the Lagrangian $L$
must satisfy the Euler-Lagrange (EL) equations

$$ \frac{\partial L}{\partial q^a} = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}^a}\right), \qquad a = 1, 2, \ldots, n. $$

For example, as shown in Section 3.19, the EL equations for the Lagrangian
(19.2) are

$$ m\left(\ddot{q}^a + \Gamma^a{}_{bc}\, \dot{q}^b \dot{q}^c\right) = -g^{ab}\, \partial_b V, $$

which corresponds to Newton’s second law in an arbitrary coordinate system.
If the $q^a(t)$ are taken to be the Cartesian coordinates $x^a(t)$ of the
particle, we immediately recover the more familiar form
$m\ddot{x}^a = -\delta^{ab}\, \partial_b V$.
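Hamilton’s principle can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: it assumes a one-dimensional harmonic oscillator with $L = \tfrac{1}{2}m\dot{q}^2 - \tfrac{1}{2}kq^2$, and the function names, the step count and the particular variation $\delta q = \epsilon \sin(\pi t/t_{\rm end})$ are all illustrative choices. It evaluates a discretised action on the true path plus a small variation that vanishes at the endpoints.

```python
import math

def discretised_action(path, dt, m=1.0, k=1.0):
    """Approximate S = integral of (T - V) dt for L = m*qdot^2/2 - k*q^2/2.

    Each time step uses a finite-difference velocity and the midpoint position.
    """
    total = 0.0
    for q0, q1 in zip(path[:-1], path[1:]):
        velocity = (q1 - q0) / dt
        q_mid = 0.5 * (q0 + q1)
        total += (0.5 * m * velocity**2 - 0.5 * k * q_mid**2) * dt
    return total

def action_on_varied_path(eps, n_steps=2000, t_end=1.0, m=1.0, k=1.0):
    """Action along q(t) = sin(omega t) + eps * sin(pi t / t_end)."""
    dt = t_end / n_steps
    omega = math.sqrt(k / m)
    path = []
    for i in range(n_steps + 1):
        t = i * dt
        q_true = math.sin(omega * t)          # solves m*qddot = -k*q
        eta = math.sin(math.pi * t / t_end)   # vanishes at both endpoints
        path.append(q_true + eps * eta)
    return discretised_action(path, dt, m, k)

if __name__ == "__main__":
    s0 = action_on_varied_path(0.0)
    for eps in (1e-1, 1e-2, 1e-3):
        delta_s = action_on_varied_path(eps) - s0
        print(f"eps = {eps:g}  delta S = {delta_s:.3e}  "
              f"delta S / eps^2 = {delta_s / eps**2:.4f}")
```

Under these assumptions the ratio $\delta S/\epsilon^2$ settles to a constant, close to $(\pi^2 - 1)/4$ for $m = k = t_{\rm end} = 1$, as $\epsilon$ decreases. The change in the action is therefore second order in $\epsilon$: there is no first-order term, which is what stationarity means.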

Hamilton’s principle is easily extended from the notion of discrete particles
to continuous systems. As an example, let us consider a flexible string
stretched between two fixed points at $x = 0$ and $x = l$. In this case, we
again have one independent time coordinate $t$, but now in the context of a
continuum in which the $q^a(t)$ become the continuous variable $\phi(t, x)$
describing the transverse displacement of the string as a function of position
and time (see Figure 19.1). Consequently, the expressions for $T$ and $V$
become integrals over $x$ rather than sums over the label $a$. If $\rho(x)$ and
$\tau(x)$ are the local line density and tension of the string then the
kinetic and potential energies of the string for small displacements are given
by

$$ T = \int_0^l \tfrac{1}{2}\rho \left(\frac{\partial \phi}{\partial t}\right)^2 dx, \qquad V = \int_0^l \tfrac{1}{2}\tau \left(\frac{\partial \phi}{\partial x}\right)^2 dx. $$

Figure 19.1 The transverse displacement $\phi(t, x)$ of a taut string fixed at
two points a distance $l$ apart, viewed as a function in the $(t, x)$-plane.

Thus, the action (19.1) becomes

$$ S = \int_{t_1}^{t_2}\!\int_0^l \mathcal{L}\, dx\, dt = \int_{t_1}^{t_2}\!\int_0^l \tfrac{1}{2}\left[\rho(\partial_t \phi)^2 - \tau(\partial_x \phi)^2\right] dx\, dt, \tag{19.3} $$

where in the first equality we have defined the Lagrangian density
$\mathcal{L}$ and in the final expression we have adopted the shorthand
$\partial_t \equiv \partial/\partial t$ and
$\partial_x \equiv \partial/\partial x$. Let us now consider an arbitrary
variation in the function $\phi$ of the form

$$ \phi(t, x) \to \phi'(t, x) = \phi(t, x) + \delta\phi(t, x). \tag{19.4} $$


This leads to a variation in the action (19.1) given by 


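$$ \delta S = \int_{t_1}^{t_2}\!\int_0^l \left[\frac{\partial \mathcal{L}}{\partial(\partial_t \phi)}\, \delta(\partial_t \phi) + \frac{\partial \mathcal{L}}{\partial(\partial_x \phi)}\, \delta(\partial_x \phi)\right] dx\, dt. \tag{19.5} $$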


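From (19.4), one immediately notes that
$\delta(\partial_t \phi) = \partial_t(\delta\phi)$ and
$\delta(\partial_x \phi) = \partial_x(\delta\phi)$. Substituting these
expressions in (19.5) and using Leibnitz’ rule for the differentiation of a
product, we may write

$$ \delta S = \delta S_b - \int_{t_1}^{t_2}\!\int_0^l \left\{\partial_t\left[\frac{\partial \mathcal{L}}{\partial(\partial_t \phi)}\right] + \partial_x\left[\frac{\partial \mathcal{L}}{\partial(\partial_x \phi)}\right]\right\} \delta\phi\, dx\, dt, \tag{19.6} $$

where the ‘boundary’ (or ‘surface’) term is given by

$$ \delta S_b = \int_0^l \left[\frac{\partial \mathcal{L}}{\partial(\partial_t \phi)}\, \delta\phi\right]_{t_1}^{t_2} dx + \int_{t_1}^{t_2} \left[\frac{\partial \mathcal{L}}{\partial(\partial_x \phi)}\, \delta\phi\right]_0^l dt. $$
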
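If we assume that the variation is such that

$$ \delta\phi(t_1, x) = 0 = \delta\phi(t_2, x) \qquad \text{and} \qquad \delta\phi(t, 0) = 0 = \delta\phi(t, l) $$

then it vanishes on the entire ‘boundary’ of the region of interest in the
$(t, x)$-plane, and we have $\delta S_b = 0$. Thus, in this case, by demanding
that the total variation (19.6) in the action vanishes ($\delta S = 0$) and
using the fact that $\delta\phi$ is arbitrary, we obtain

$$ \partial_t\left[\frac{\partial \mathcal{L}}{\partial(\partial_t \phi)}\right] + \partial_x\left[\frac{\partial \mathcal{L}}{\partial(\partial_x \phi)}\right] = \partial_t(\rho\, \partial_t \phi) - \partial_x(\tau\, \partial_x \phi) = 0, $$

where, in the first equality, we have evaluated the derivatives of
$\mathcal{L}$ with respect to $\partial_t \phi$ and $\partial_x \phi$ using
(19.3). If, in addition, $\rho$ and $\tau$ do not depend on $x$ or $t$ then

$$ \frac{\partial^2 \phi}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2}, $$

where $c^2 = \tau/\rho$. This is the wave equation for small transverse
oscillations of a taut uniform string.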
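As a quick plausibility check of the wave equation just obtained, the sketch below takes a standing wave that respects the fixed ends $\phi(t, 0) = \phi(t, l) = 0$ and evaluates the residual $\partial_x^2\phi - c^{-2}\partial_t^2\phi$ by central differences. The numerical values of $\tau$, $\rho$ and $l$, the step size and the function names are illustrative assumptions only.

```python
import math

def displacement(t, x, length=1.0, tau=4.0, rho=1.0, mode=1):
    """Standing wave sin(k x) cos(k c t), with c^2 = tau/rho, k = mode*pi/length."""
    c = math.sqrt(tau / rho)
    k = mode * math.pi / length
    return math.sin(k * x) * math.cos(k * c * t)

def wave_equation_residual(t, x, tau=4.0, rho=1.0, h=1e-3):
    """Central-difference estimate of d2phi/dx2 - (1/c^2) d2phi/dt2."""
    def phi(tt, xx):
        return displacement(tt, xx, tau=tau, rho=rho)

    d2_dx2 = (phi(t, x + h) - 2.0 * phi(t, x) + phi(t, x - h)) / h**2
    d2_dt2 = (phi(t + h, x) - 2.0 * phi(t, x) + phi(t - h, x)) / h**2
    c_squared = tau / rho
    return d2_dx2 - d2_dt2 / c_squared

if __name__ == "__main__":
    # The residual should be small compared with d2phi/dx2 itself (of order k^2).
    print(wave_equation_residual(0.3, 0.4))
```

The residual is not exactly zero because of the finite-difference truncation error, which scales as $h^2$; it should nonetheless be several orders of magnitude smaller than the individual second derivatives.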

19.2 Classical field theory and the action 

In the above discussion, the function $\phi(t, x)$ may be regarded as a
‘field’ defined on a two-dimensional space (or manifold) parameterised by the
coordinates $x$ and $t$. To extend the idea of a variational principle to a
field theory in spacetime, one therefore needs only to replace $\phi(t, x)$ by
a (finite) set of fields $\Phi^a(x^\mu)$ defined on a four-dimensional
spacetime parameterised in terms of some (in general) arbitrary set of
continuous coordinates $x^\mu$. Alternatively, one could even consider each
member of the (finite) set of generalised coordinates $q^a(t)$ in (19.1) as a
‘field’ defined on a one-dimensional manifold parameterised by the continuous
coordinate $t$, and simply replace the $q^a(t)$ by the set of fields
$\Phi^a(x^\mu)$. In either case, the index $a$ acts merely as a label for the
individual fields in the theory.

This last point is worth clarifying. If, for example, one were considering a
field theory containing a set of $M$ scalar fields
$\phi^1, \phi^2, \ldots, \phi^M$ then the set of fields would be simply
$\{\Phi^a\} = \{\phi^1, \phi^2, \ldots, \phi^M\}$. Alternatively, one might be
interested in a field theory containing a vector field (such as
electromagnetism). In this case, the label $a$ would run over the four
components of the vector field in the chosen coordinate system, i.e. we would
write $\{\Phi^a\} = \{A^0, A^1, A^2, A^3\} = \{A^\mu\}$ and so $a$ would then
be a spacetime index. Similar considerations apply to the components of tensor
fields. Use of the index $a$ may also be trivially extended to label the
components of two or more vector or tensor fields involved in the theory.
Indeed, when considering field theories defined on some arbitrary manifold and
in arbitrary coordinates, one must always include the metric tensor components
in the set of fields. For example, in electromagnetism on an arbitrary
manifold, the full set of fields is in fact
$\{\Phi^a\} = \{A^\mu, g_{\mu\nu}\}$.

By analogy with (19.3), the action $S$ for a set of fields $\Phi^a$ defined on
some general four-dimensional spacetime manifold should take the form of an
integral of some function $\mathcal{L}$, called the Lagrangian density, of the
fields $\Phi^a$ and their first (and possibly higher) derivatives over some
four-dimensional region $\mathcal{R}$ of the spacetime. Thus, we take the
action integral to be

$$ S = \int_{\mathcal{R}} \mathcal{L}(\Phi^a, \partial_\mu \Phi^a, \partial_\mu \partial_\nu \Phi^a, \ldots)\, d^4x, \tag{19.7} $$

where $d^4x$ denotes the product of coordinate differentials
$dx^0\, dx^1\, dx^2\, dx^3$. It is believed that physical theories should be
generally covariant and so this symmetry must be reflected in the action $S$,
which therefore has to be a scalar under general coordinate transformations.
From the discussion in Section 2.14, we know that in any arbitrary coordinate
system $x^\mu$ the invariant volume element (which transforms as a scalar
field) is $d^4V = \sqrt{-g}\, d^4x$, where $g$ is the determinant of the metric
tensor in that coordinate system (and is negative for the signature of the
metric used in this book). It is therefore convenient to write the action
(19.7) in the form

$$ S = \int_{\mathcal{R}} L \sqrt{-g}\, d^4x, $$


where we have introduced the field Lagrangian $L$, which is clearly related to
the Lagrangian density $\mathcal{L}$ by$^{1}$

$$ \mathcal{L} = L\sqrt{-g}. \tag{19.8} $$

For the action $S$ to be a scalar, the quantity $L\sqrt{-g}\, d^4x$ must be a
scalar field at each point in $\mathcal{R}$. Since the invariant volume element
$\sqrt{-g}\, d^4x$ is already a scalar field, then so too must be the
Lagrangian $L$. Taking $L$ to be in general a function of the fields $\Phi^a$
and their first (and possibly higher) derivatives, the action for a set of
classical fields defined on some four-dimensional spacetime manifold may be
written as

$$ S = \int_{\mathcal{R}} L(\Phi^a, \partial_\mu \Phi^a, \partial_\mu \partial_\nu \Phi^a, \ldots)\sqrt{-g}\, d^4x, $$

where $L$ is a scalar function of spacetime position. We note finally that the
Lagrangian density $\mathcal{L}$ in (19.8) will not transform as a scalar field
under coordinate transformations; in fact, it is what is known as a scalar
density of weight unity, although we need not concern ourselves here with the
definition of such objects.

$^{1}$ Although most authors agree that $\mathcal{L}$ is called the Lagrangian
density, it is common in field theory for the term Lagrangian (and the symbol
$L$) to mean the integral of $\mathcal{L}$ over some three-dimensional
spacelike hypersurface, rather than the relationship given in (19.8). We will
adopt the convention (19.8) throughout this chapter.


19.3 Euler-Lagrange equations 

We now derive the form of the field equations for (some subset of) the fields
$\Phi^a$ by demanding that the action is stationary, or invariant, under small
variations in (the same subset of) the fields of the form

$$ \Phi^a(x) \to \Phi'^a(x) = \Phi^a(x) + \delta\Phi^a(x). \tag{19.9} $$

It is important to note that we are not performing any coordinate
transformation here; we are considering only variations in the functional forms
of the fields $\Phi^a$ in a fixed coordinate system. For simplicity, we shall
perform our derivation of the field equations under the assumption that the
field theory is local, which means that second- or higher-order derivatives of
the fields do not appear in the action. Thus, we need only consider the
consequent variation in the first derivatives of the fields, which, from
(19.9), is given by

$$ \partial_\mu \Phi^a \to \partial_\mu \Phi'^a = \partial_\mu \Phi^a + \partial_\mu(\delta\Phi^a). \tag{19.10} $$

We also note for later use that, from its definition (19.9), the
$\delta$-operator commutes with derivatives since

$$ \partial_\mu(\delta\Phi^a) = \partial_\mu(\Phi'^a - \Phi^a) = \partial_\mu \Phi'^a - \partial_\mu \Phi^a = \delta(\partial_\mu \Phi^a). \tag{19.11} $$

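The variations (19.9, 19.10) lead to a variation in the action
$S \to S + \delta S$, with

$$ \delta S = \int_{\mathcal{R}} \delta\mathcal{L}\, d^4x = \int_{\mathcal{R}} \left[\frac{\partial \mathcal{L}}{\partial \Phi^a}\, \delta\Phi^a + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \Phi^a)}\, \delta(\partial_\mu \Phi^a)\right] d^4x, \tag{19.12} $$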


where, for the time being, it is convenient to work in terms of the Lagrangian
density $\mathcal{L}$ defined in (19.8). To derive the field equations, we wish
to factor out the variation $\delta\Phi^a$ in the second term of the integrand.
Using (19.11), this second term may be written

$$ \int_{\mathcal{R}} \frac{\partial \mathcal{L}}{\partial(\partial_\mu \Phi^a)}\, \partial_\mu(\delta\Phi^a)\, d^4x = \int_{\mathcal{R}} \partial_\mu\left[\frac{\partial \mathcal{L}}{\partial(\partial_\mu \Phi^a)}\, \delta\Phi^a\right] d^4x - \int_{\mathcal{R}} \partial_\mu\left[\frac{\partial \mathcal{L}}{\partial(\partial_\mu \Phi^a)}\right] \delta\Phi^a\, d^4x, $$

where we have integrated by parts (which corresponds simply to rewriting the
integrand using Leibnitz’ theorem for the derivative of a product). The first
integral on the right-hand side is a total derivative and can therefore be
converted into an integral over the bounding surface $\partial\mathcal{R}$ of
the region $\mathcal{R}$, by straightforward calculus. If we restrict the
permissible variations to those that vanish on the boundary
$\partial\mathcal{R}$, this integral will also vanish and so (19.12)
becomes$^{2}$


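$$ \delta S = \int_{\mathcal{R}} \frac{\delta \mathcal{L}}{\delta \Phi^a}\, \delta\Phi^a\, d^4x = \int_{\mathcal{R}} \left\{\frac{\partial \mathcal{L}}{\partial \Phi^a} - \partial_\mu\left[\frac{\partial \mathcal{L}}{\partial(\partial_\mu \Phi^a)}\right]\right\} \delta\Phi^a\, d^4x, $$

where, in the first equality, we define the variational derivative
$\delta\mathcal{L}/\delta\Phi^a$ of the Lagrangian density with respect to the
field $\Phi^a$. If we demand that the action is stationary, so that
$\delta S = 0$, under the arbitrary variations $\delta\Phi^a$, we thus require
that

$$ \frac{\delta \mathcal{L}}{\delta \Phi^a} = \frac{\partial \mathcal{L}}{\partial \Phi^a} - \partial_\mu\left[\frac{\partial \mathcal{L}}{\partial(\partial_\mu \Phi^a)}\right] = 0. \tag{19.13} $$

These are the Euler-Lagrange (EL) equations, which correspond to the field
equations of the (local) field theory defined by the action
$S = \int_{\mathcal{R}} \mathcal{L}\, d^4x$. If, in addition, the Lagrangian
density depends on second- or higher-order derivatives of the fields then the
above derivation is straightforwardly generalised. For example, if second-order
derivatives also appear then one obtains

$$ \frac{\delta \mathcal{L}}{\delta \Phi^a} = \frac{\partial \mathcal{L}}{\partial \Phi^a} - \partial_\mu\left[\frac{\partial \mathcal{L}}{\partial(\partial_\mu \Phi^a)}\right] + \partial_\mu \partial_\nu\left[\frac{\partial \mathcal{L}}{\partial(\partial_\mu \partial_\nu \Phi^a)}\right] = 0, \tag{19.14} $$

provided that the variations $\delta\Phi^a$ and their first derivatives vanish
on the boundary $\partial\mathcal{R}$.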


$^{2}$ The restriction that the variation $\delta\Phi^a$ vanishes on the
boundary $\partial\mathcal{R}$ is generally allowable, except when discussing
topological objects in field theory such as instantons, which are beyond the
scope of our discussion.


19.4 Alternative form of the Euler-Lagrange equations

The EL equations in the form (19.13), or generalised to higher-order
derivatives of the fields, provide a straightforward means of determining the
field equations corresponding to a given action. In particular, these equations
still hold if (some of) the fields $\Phi^a$ being varied are the components of
the metric tensor $g_{\mu\nu}$ (or functions thereof), as will be the case when
we derive the Einstein equations from the gravitational action in Section 19.8.

Nevertheless, if the fields $\Phi^a$ being varied are not functions of the
metric tensor components then the presence of the $\sqrt{-g}$ factor in the
Lagrangian density (19.8) makes evaluation of the derivative terms in the EL
equations (19.13) unnecessarily cumbersome, although one will nevertheless
arrive at the correct field equations. In such cases, however, the Lagrangian
$L$ can often be written in terms of the fields $\Phi^a$ and their first (and
possibly higher-order) covariant derivatives $\nabla_\mu \Phi^a$, as opposed to
partial derivatives. Indeed, recalling that $L$ should be a scalar function of
spacetime position, one might expect this to be the case since scalars are most
easily obtained by contracting tensor indices. Let us therefore repeat our
derivation of the form of the EL equations, working instead with an action of
the form

$$ S = \int_{\mathcal{R}} L(\Phi^a, \nabla_\mu \Phi^a, \ldots)\sqrt{-g}\, d^4x, \tag{19.15} $$

where the fields $\Phi^a$ being varied are independent of the metric tensor
$g_{\mu\nu}$ but $L$ might still contain $g_{\mu\nu}$, to raise or lower
indices, for example ($L$ might also contain the partial derivatives of
$g_{\mu\nu}$; recall that the covariant derivatives of the metric vanish
identically).

For simplicity, let us again assume that no second- or higher-order covariant
derivatives appear in $L$. The variation (19.9) leads to variations in the
first covariant derivatives of the fields given by

$$ \nabla_\mu \Phi^a \to \nabla_\mu \Phi'^a = \nabla_\mu \Phi^a + \nabla_\mu(\delta\Phi^a). \tag{19.16} $$

In a similar way to before, we note that the $\delta$-operator commutes with
covariant derivatives, so that
$\nabla_\mu(\delta\Phi^a) = \delta(\nabla_\mu \Phi^a)$. The variations
(19.9, 19.16) lead, in turn, to a variation in the action
$S \to S + \delta S$, with

$$ \delta S = \int_{\mathcal{R}} \delta L \sqrt{-g}\, d^4x = \int_{\mathcal{R}} \left[\frac{\partial L}{\partial \Phi^a}\, \delta\Phi^a + \frac{\partial L}{\partial(\nabla_\mu \Phi^a)}\, \delta(\nabla_\mu \Phi^a)\right] \sqrt{-g}\, d^4x, \tag{19.17} $$

where we are now working in terms of the Lagrangian $L$ (as opposed to the
Lagrangian density $\mathcal{L}$). The partial derivative appearing in the
first term of the integrand on the right-hand side deserves some comment. In
(19.17), we are treating $\Phi^a$ and $\nabla_\mu \Phi^a$ as independent
variables. In general, however, the covariant derivatives $\nabla_\mu \Phi^a$
will contain terms involving the fields $\Phi^a$ multiplied by some connection
coefficient. If, for some reason, these terms are written out explicitly in the
Lagrangian, they must not be included when calculating the partial derivative
of $L$ with respect to the fields $\Phi^a$.

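As in our previous derivation of the EL equations, we must now factor out the
variation $\delta\Phi^a$ in the second term of the integrand in (19.17). Using
the fact that the $\delta$-operator commutes with the covariant derivative and
employing Leibnitz’ theorem for the covariant differentiation of a product,
this term may be written

$$ \int_{\mathcal{R}} \frac{\partial L}{\partial(\nabla_\mu \Phi^a)}\, \nabla_\mu(\delta\Phi^a)\sqrt{-g}\, d^4x = \int_{\mathcal{R}} \nabla_\mu\left[\frac{\partial L}{\partial(\nabla_\mu \Phi^a)}\, \delta\Phi^a\right] \sqrt{-g}\, d^4x - \int_{\mathcal{R}} \nabla_\mu\left[\frac{\partial L}{\partial(\nabla_\mu \Phi^a)}\right] \delta\Phi^a \sqrt{-g}\, d^4x. \tag{19.18} $$

We may now use the divergence theorem to convert the first integral on the
right-hand side to an integral over the boundary $\partial\mathcal{R}$. The
divergence theorem reads

$$ \int_{\mathcal{R}} (\nabla_\mu V^\mu)\sqrt{|g|}\, d^4x = \int_{\partial\mathcal{R}} n_\mu V^\mu \sqrt{|\gamma|}\, d^3y, \tag{19.19} $$

where $V^\mu$ is an arbitrary vector field, $\gamma$ is the determinant of the
induced metric on the boundary in the coordinates $y^i$ (see Section 2.14) and
$n_\mu$ is a unit normal to the boundary. Applying this theorem to the first
integral on the right-hand side of (19.18) and restricting the allowed
variations $\delta\Phi^a$ to vanish on $\partial\mathcal{R}$, we see that this
integral is zero. Thus (19.17) becomes

$$ \delta S = \int_{\mathcal{R}} \frac{\delta L}{\delta \Phi^a}\, \delta\Phi^a \sqrt{-g}\, d^4x = \int_{\mathcal{R}} \left\{\frac{\partial L}{\partial \Phi^a} - \nabla_\mu\left[\frac{\partial L}{\partial(\nabla_\mu \Phi^a)}\right]\right\} \delta\Phi^a \sqrt{-g}\, d^4x, $$

where, in the first equality, we define the variational derivative
$\delta L/\delta\Phi^a$ of the Lagrangian with respect to the field $\Phi^a$.
Thus, demanding stationarity of the action, $\delta S = 0$, we obtain the
alternative form for the Euler-Lagrange equations

$$ \frac{\delta L}{\delta \Phi^a} = \frac{\partial L}{\partial \Phi^a} - \nabla_\mu\left[\frac{\partial L}{\partial(\nabla_\mu \Phi^a)}\right] = 0. \tag{19.20} $$

We shall make use of this form for the EL equations when we consider the field
theories of a real scalar field in Section 19.6 and electromagnetism in
Section 19.7.


19.5 Equivalent actions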

From the derivation of the EL equations (19.13), the alert reader will have
noticed that there exists an ambiguity in the definition of the action. This
derives from the fact that one can always convert the integral of a total
derivative over some region $\mathcal{R}$ into an integral over the bounding
surface $\partial\mathcal{R}$. Let us therefore consider the following
modification of the Lagrangian density:

$$ \bar{\mathcal{L}} = \mathcal{L} + \partial_\mu Q^\mu(\Phi^a), \tag{19.21} $$

where the $Q^\mu$ may, in general, be four arbitrary functions of the fields
(but not of their derivatives). The corresponding action thus reads

$$ \bar{S} = S + \int_{\mathcal{R}} \partial_\mu Q^\mu\, d^4x. $$

The variation in this action under the variation in the fields $\Phi^a$ (19.9)
is given by

$$ \delta\bar{S} = \delta S + \int_{\mathcal{R}} \partial_\mu(\delta Q^\mu)\, d^4x = \delta S + \int_{\mathcal{R}} \partial_\mu\left(\frac{\partial Q^\mu}{\partial \Phi^a}\, \delta\Phi^a\right) d^4x, $$

where $\delta S$ is the variation in the original action given by the equation
before (19.13) and we have used the fact that the $\delta$-operator commutes
with derivatives. Since the last integral on the right-hand side is a total
derivative, it can be converted to a surface integral over the boundary
$\partial\mathcal{R}$. Assuming once again that the variations $\delta\Phi^a$
vanish on $\partial\mathcal{R}$, this surface integral is zero and so
$\delta\bar{S} = \delta S$. Hence demanding that $\delta\bar{S} = 0$ yields the
same EL equations as demanding that $\delta S = 0$, and the two actions are
said to be equivalent. In other words, any two Lagrangian densities related by
an expression of the form (19.21) lead to the same EL equations. The above
argument is easily extended to the case in which $\mathcal{L}$ contains second-
or higher-order derivatives of the fields. For example, if second-order
derivatives also appear in $\mathcal{L}$ then the same EL equations (19.14)
will be obtained from any Lagrangian density of the form

$$ \bar{\mathcal{L}} = \mathcal{L} + \partial_\mu Q^\mu(\Phi^a, \partial_\nu \Phi^a), \tag{19.22} $$

provided that the variations $\delta\Phi^a$ and their first derivatives vanish
on the boundary $\partial\mathcal{R}$.
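The idea of equivalent actions can be seen in its simplest guise in the one-dimensional (mechanical) analogue, where the added term is a total time derivative $dQ/dt$ with $Q = Q(q)$. The sketch below is illustrative only: the choice $Q(q) = q^3$, the two sample paths and all function names are assumptions made for the example. It integrates $dQ/dt = (dQ/dq)\,\dot{q}$ along two different paths sharing the same endpoints and shows that the extra contribution to the action depends only on those endpoints.

```python
import math

def total_derivative_contribution(path, dq_derivative):
    """Approximate the integral of dQ/dt = (dQ/dq) * qdot along a sampled path.

    The midpoint rule is applied in q, so no explicit time step is required.
    """
    total = 0.0
    for q0, q1 in zip(path[:-1], path[1:]):
        q_mid = 0.5 * (q0 + q1)
        total += dq_derivative(q_mid) * (q1 - q0)
    return total

def sample_path(shape, n_steps=1000, t_end=1.0):
    """Sample q(t) = shape(t) at n_steps + 1 equally spaced times."""
    return [shape(i * t_end / n_steps) for i in range(n_steps + 1)]

def q_derivative(q):
    """dQ/dq for the hypothetical choice Q(q) = q**3."""
    return 3.0 * q**2

# Two different paths from q = 0 at t = 0 to q = 1 at t = 1.
straight = sample_path(lambda t: t)
wiggly = sample_path(lambda t: t + 0.5 * math.sin(math.pi * t))

if __name__ == "__main__":
    # Both values should be close to Q(1) - Q(0) = 1.
    print(total_derivative_contribution(straight, q_derivative))
    print(total_derivative_contribution(wiggly, q_derivative))
```

Since the added term takes the same value on every path with the given endpoints, it cannot affect which path makes the action stationary. This is the one-dimensional counterpart of converting the total-derivative term into a surface integral.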

Despite the appealing features of the above mathematical manoeuvre, the very
general nature of the allowed transformation (19.21) can lead to problems of
principle. In particular, we have not constrained in any way the transformation
properties of the four quantities $Q^\mu$. Thus, we have not ensured that the
quantity $\partial_\mu Q^\mu\, d^4x$ is a scalar function under coordinate
transformations. Strictly speaking, one should ensure that this is true in
order that the second term on the right-hand side of (19.21) is a scalar
quantity. Without this criterion, the value of this integral (and hence the
action $\bar{S}$) is not a scalar, i.e. its value changes depending on the
choice of coordinates. We shall see in Section 19.9, however, that the
necessary requirements on the quantities $Q^\mu$ are not always imposed. A
partial defence of such practices is that, as stated earlier, in the variation
(19.9) we are not performing any coordinate transformation; we are considering
only variations in the functional forms of the fields $\Phi^a$ in a fixed
coordinate system. One might therefore be persuaded that the variational
formalism outlined above would survive the introduction of terms in the action
that are not scalars under general coordinate transformations. In principle,
however, such sleight of hand is best avoided, and one should always aim to
construct an action that is a true covariant scalar.

We may also construct equivalent actions when the original action takes the
form (19.15), remembering that in this case we are assuming that the fields of
interest $\Phi^a$ are independent of the components of the metric tensor
$g_{\mu\nu}$. Suppose, for example, that no second- or higher-order covariant
derivatives of the fields appear in $L$, and consider the new Lagrangian

$$ \bar{L} = L + \nabla_\mu Q^\mu(\Phi^a), $$

where the functions $Q^\mu$ depend only on the fields and not on their first
covariant derivatives. The corresponding action then reads

$$ \bar{S} = S + \int_{\mathcal{R}} \nabla_\mu Q^\mu \sqrt{-g}\, d^4x, \tag{19.23} $$

and its variation is given by

$$ \delta\bar{S} = \delta S + \int_{\mathcal{R}} \nabla_\mu(\delta Q^\mu)\sqrt{-g}\, d^4x = \delta S + \int_{\mathcal{R}} \nabla_\mu\left(\frac{\partial Q^\mu}{\partial \Phi^a}\, \delta\Phi^a\right) \sqrt{-g}\, d^4x, $$

where $\delta S$ is the variation in the original action and again we have used
the result that the $\delta$-operator commutes with covariant derivatives.
Using the divergence theorem (19.19), the last integral on the right-hand side
can be converted to a surface integral over the boundary
$\partial\mathcal{R}$. Assuming once again that the variations $\delta\Phi^a$
vanish on $\partial\mathcal{R}$, we find that $\delta\bar{S} = \delta S$, and
so we have obtained the same EL equations (19.20) by demanding that
$\delta\bar{S} = 0$ as we did by demanding that $\delta S = 0$. We note that,
by using the divergence theorem to obtain a surface integral, in the present
case we require the $Q^\mu$ to be the components of a vector. This also ensures
that $\nabla_\mu Q^\mu$ is a scalar field, and so the second term on the
right-hand side of (19.23) (and hence the total action $\bar{S}$) is a scalar
integral.


19.6 Field theory of a real scalar field 

The simplest example of a field theory is that of a single real scalar field
$\phi(x^\mu)$ defined on the spacetime. We will also restrict our
considerations to a local field theory, so that no second- or higher-order
derivatives of the field appear in the Lagrangian $L$.

As a starting point, we take as inspiration the Lagrangian (19.2) for the
classical motion of a mechanical system in Newtonian mechanics. This Lagrangian
is expressed in terms of the derivatives of the generalised coordinates
$q^a(t)$ with respect to the time parameter $t$, the metric $g_{ab}$ of the
configuration space of the system and a potential $V(q^a)$. Replacing the
generalised coordinates by the field $\phi(x^\mu)$ and time derivatives by
derivatives with respect to spacetime position, a reasonable choice of
Lagrangian is given by

$$ L = \tfrac{1}{2} g^{\mu\nu}(\nabla_\mu \phi)(\nabla_\nu \phi) - V(\phi), \tag{19.24} $$

where the first term may be loosely regarded as the ‘kinetic energy’ of the
field and the second term as its ‘potential energy’. In the expression (19.24),
we have used covariant derivatives rather than partial derivatives since, as
stated in Section 19.2, $L$ must itself be a scalar function of spacetime
position. However, since the covariant derivative of a scalar quantity reduces
to a partial derivative, in this case the latter could be used. Nevertheless,
it is usually wiser to retain the manifestly covariant notation in (19.24). In
particular, we see immediately that the corresponding action is given by

$$ S = \int_{\mathcal{R}} \left[\tfrac{1}{2} g^{\mu\nu}(\nabla_\mu \phi)(\nabla_\nu \phi) - V(\phi)\right] \sqrt{-g}\, d^4x, \tag{19.25} $$

which is of the general form given in (19.15). Varying this action with respect
to $\phi$, we may therefore use the convenient form of the EL equations given
in (19.20).
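For the form of Lagrangian (19.24) we have

$$ \frac{\partial L}{\partial \phi} = -\frac{dV}{d\phi} \qquad \text{and} \qquad \frac{\partial L}{\partial(\nabla_\mu \phi)} = \frac{\partial}{\partial(\nabla_\mu \phi)}\left[\tfrac{1}{2} g^{\rho\sigma}(\nabla_\rho \phi)(\nabla_\sigma \phi)\right], $$

where in the second equation we have relabelled the dummy indices in order to
make the differentiation more transparent. Evaluating this derivative
explicitly gives$^{3}$

$$ \frac{\partial L}{\partial(\nabla_\mu \phi)} = \tfrac{1}{2} g^{\rho\sigma}\left[\delta^\mu_\rho(\nabla_\sigma \phi) + (\nabla_\rho \phi)\delta^\mu_\sigma\right] = \tfrac{1}{2}\left(g^{\mu\sigma}\nabla_\sigma \phi + g^{\rho\mu}\nabla_\rho \phi\right) = g^{\mu\nu}\nabla_\nu \phi, $$

and so the EL equations (19.20) become

$$ -\frac{dV}{d\phi} - \nabla_\mu\left(g^{\mu\nu}\nabla_\nu \phi\right) = 0. $$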


$^{3}$ With a little practice, derivatives of this sort can in fact be
evaluated very quickly, without needing to employ the explicit relabelling step
used above.

Remembering that the covariant derivative of the metric tensor is zero, and
rearranging, we thus find that the dynamical field equation satisfied by
$\phi$ is

$$ \Box^2 \phi + \frac{dV}{d\phi} = 0, \tag{19.26} $$

where $\Box^2 \equiv \nabla_\mu \nabla^\mu$ is the covariant d’Alembertian
operator.

A common choice for the potential is $V = \tfrac{1}{2} m^2 \phi^2$, where $m$
is a constant parameter that characterises the dynamics of the scalar field.
The field equation (19.26) then becomes

$$ \Box^2 \phi + m^2 \phi = 0, $$

which is known as the Klein-Gordon equation. Upon quantisation (which is
beyond the scope of our discussion), this field theory describes collections of
neutral spinless particles of mass $m$ that do not interact with each other
except through their mutual gravitational attraction.
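To make the Klein-Gordon equation more concrete, the following sketch checks it numerically in the simplest setting. It assumes flat spacetime with one spatial dimension, Cartesian coordinates, units with $c = 1$ and a metric of signature $(+, -)$, so that $\Box^2 = \partial_t^2 - \partial_x^2$; these assumptions, the parameter values and the function names are ours and are chosen purely for illustration. A plane wave $\cos(\omega t - kx)$ should satisfy the equation only if $\omega^2 = k^2 + m^2$.

```python
import math

def plane_wave(t, x, omega, k):
    """Trial solution phi = cos(omega t - k x)."""
    return math.cos(omega * t - k * x)

def klein_gordon_residual(t, x, omega, k, m, h=1e-3):
    """Central-difference estimate of (d2/dt2 - d2/dx2) phi + m^2 phi."""
    phi0 = plane_wave(t, x, omega, k)
    d2_dt2 = (plane_wave(t + h, x, omega, k) - 2.0 * phi0
              + plane_wave(t - h, x, omega, k)) / h**2
    d2_dx2 = (plane_wave(t, x + h, omega, k) - 2.0 * phi0
              + plane_wave(t, x - h, omega, k)) / h**2
    return d2_dt2 - d2_dx2 + m**2 * phi0

k, m = 1.5, 2.0
omega_on_shell = math.sqrt(k**2 + m**2)   # satisfies omega^2 = k^2 + m^2
omega_off_shell = 1.0                     # deliberately violates the relation

if __name__ == "__main__":
    print(klein_gordon_residual(0.2, 0.1, omega_on_shell, k, m))
    print(klein_gordon_residual(0.2, 0.1, omega_off_shell, k, m))
```

The first residual should vanish to within the finite-difference truncation error, whereas the second should be of order unity. The parameter $m$ thus fixes the relation between the frequency and wavenumber of the field.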


19.7 Electromagnetism from a variational principle 

As discussed in Chapter 6, electromagnetism may be described in terms of the
vector field $A_\mu$. Thus, using the general description given in
Section 19.2, the fields $\Phi^a$ ($a = 1, \ldots, 4$) being varied are the
components of this vector field and so $a$ is a spacetime index. To describe
the dynamics of the electromagnetic field in terms of the variational
principle, again we begin by constructing a Lagrangian $L$ which is a function
of $A_\mu$ and its first derivatives and which behaves as a scalar field under
general coordinate transformations. We will work from the outset assuming
arbitrary coordinates.

In the case of electromagnetism, however, we saw in Chapter 6 that the theory
also possesses a gauge invariance. If $A_\mu$ describes the electromagnetic
field in some physical situation then the same situation is also described by
any other field of the form

$$ A_\mu \to A'_\mu = A_\mu + \nabla_\mu \psi = A_\mu + \partial_\mu \psi, \tag{19.27} $$

where $\psi$ is any scalar field (the last equality holds because the covariant
derivative of the scalar is simply its partial derivative). As discussed
earlier, by demanding that the action be invariant under some symmetry one
ensures that the resulting equations of motion also respect that symmetry. We
must therefore make sure that the action is invariant under the gauge
transformation (19.27). This precludes us from forming scalars depending on
$A_\mu A^\mu$, since it is easy to show that this expression is not gauge
invariant. Nevertheless, the electromagnetic field-strength tensor

$$ F_{\mu\nu} = \nabla_\mu A_\nu - \nabla_\nu A_\mu = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{19.28} $$

is easily shown to be gauge invariant; the second equality in (19.28) holds
since a convenient cancellation occurs between the terms containing connection
coefficients arising from the two covariant derivatives. The most obvious
scalar to be constructed from the field-strength tensor is simply
$F_{\mu\nu}F^{\mu\nu} = g^{\mu\rho} g^{\nu\sigma} F_{\rho\sigma} F_{\mu\nu}$.
Including a factor of $-1/(4\mu_0)$ for later convenience, we shall take the
‘free-field’ part of the Lagrangian to be

$$ L_f = -\frac{1}{4\mu_0}\, g^{\mu\rho} g^{\nu\sigma}(\nabla_\rho A_\sigma - \nabla_\sigma A_\rho)(\nabla_\mu A_\nu - \nabla_\nu A_\mu), $$

where, again for later convenience, we have written the expression in terms of
covariant derivatives rather than partial derivatives.

So far we have not taken into account that the source of the electromagnetic
field is the 4-current density $j^\mu$ of any charged matter present. To
describe this, we must include an ‘interaction term’ in the Lagrangian. The
most straightforward scalar we may construct from the electromagnetic field and
the current density is $j^\mu A_\mu$, and we will take the interaction term to
be $L_i = -j^\mu A_\mu$. Taking the full Lagrangian to be $L = L_f + L_i$, the
action reads

$$ S = \int_{\mathcal{R}} \left[-\frac{1}{4\mu_0}\, g^{\mu\rho} g^{\nu\sigma}(\nabla_\rho A_\sigma - \nabla_\sigma A_\rho)(\nabla_\mu A_\nu - \nabla_\nu A_\mu) - j^\mu A_\mu\right] \sqrt{-g}\, d^4x. \tag{19.29} $$

As is immediately apparent, however, the interaction term $-j^\mu A_\mu$ is not
automatically gauge invariant. Under the gauge transformation (19.27) the
corresponding term in the action becomes

$$ -\int_{\mathcal{R}} j^\mu(A_\mu + \nabla_\mu \psi)\sqrt{-g}\, d^4x = -\int_{\mathcal{R}} \left[j^\mu A_\mu + \nabla_\mu(j^\mu \psi) - (\nabla_\mu j^\mu)\psi\right] \sqrt{-g}\, d^4x. $$

Using the divergence theorem (19.19), we may write the second term in the
integrand on the right-hand side as a surface integral over the boundary
$\partial\mathcal{R}$. Taking the source to vanish on $\partial\mathcal{R}$
(by, for example, taking the boundary to be at spatial infinity), the surface
integral is zero. Thus, we see that the part of the action arising from the
interaction term is, in fact, gauge invariant, provided that the source
$j^\mu$ satisfies the covariant continuity equation

$$ \nabla_\mu j^\mu = 0, $$

and so the requirement of gauge invariance implies the conservation of charge.

Thus, under the appropriate conditions, the action (19.29) is invariant under
the gauge transformation (19.27) and, by construction, is a scalar under
general coordinate transformations. Let us now determine the Euler-Lagrange
equations resulting from varying the fields $A_\mu$ in this action (while
keeping the source $j^\mu$ fixed). From (19.29), we see that the action has the
general form (19.15). Therefore we may once again use the form of the EL
equations given in (19.20), which in this case read


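$$ \frac{\partial L}{\partial A_\nu} - \nabla_\mu\left[\frac{\partial L}{\partial(\nabla_\mu A_\nu)}\right] = 0. \tag{19.30} $$

For the action in (19.29), we have immediately

$$ \frac{\partial L}{\partial A_\nu} = -j^\nu, \tag{19.31} $$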


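but evaluation of the second term on the left-hand side of (19.30) requires
more care. Relabelling dummy indices, and writing
$\nabla_\mu A_\nu - \nabla_\nu A_\mu = F_{\mu\nu}$ for convenience, we have

$$
\begin{aligned}
\frac{\partial L}{\partial(\nabla_\mu A_\nu)}
&= -\frac{1}{4\mu_0}\frac{\partial}{\partial(\nabla_\mu A_\nu)}\left[g^{\alpha\rho} g^{\beta\sigma} F_{\rho\sigma} F_{\alpha\beta}\right] \\
&= -\frac{1}{4\mu_0}\, g^{\alpha\rho} g^{\beta\sigma}\left[(\delta^\mu_\rho \delta^\nu_\sigma - \delta^\mu_\sigma \delta^\nu_\rho)F_{\alpha\beta} + (\delta^\mu_\alpha \delta^\nu_\beta - \delta^\mu_\beta \delta^\nu_\alpha)F_{\rho\sigma}\right] \\
&= -\frac{1}{4\mu_0}(F^{\mu\nu} - F^{\nu\mu}) - \frac{1}{4\mu_0}(F^{\mu\nu} - F^{\nu\mu}) = -\frac{1}{\mu_0} F^{\mu\nu},
\end{aligned}
$$

where in the last equality we have used the antisymmetry of the field-strength
tensor (19.28). Combining this result with (19.31), the EL equations (19.30)
read

$$ \nabla_\mu F^{\mu\nu} = \mu_0\, j^\nu, $$

which is the same expression as that for the inhomogeneous Maxwell equations in
an arbitrary coordinate system, given in Section 7.7. The remaining homogeneous
Maxwell equations are in fact automatically satisfied from the definition
(19.28) of the field-strength tensor, since

$$ \nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} = \partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0. $$
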
Of course, one may object to the fact that we carefully constructed the action
(19.29) (by, for example, including specific factors in $L_f$ and $L_i$) in
such a way that its variation with respect to $A_\mu$ led to the field
equations for electromagnetism. Nevertheless, the derivation above illustrates
the natural way in which the action approach constrains the possible forms for
the theory and allows any symmetries in the theory to be made manifest.


19.8 The Einstein-Hilbert action and general relativity in vacuo 

We now use our experience in expressing scalar field theory and electromagnetism 
as variational principles to construct an action for gravitation from which the 
Einstein field equations of general relativity can be derived. For the time being, 
we will restrict our attention to general relativity in vacuo. 

To construct an action for general relativity, we must define a Lagrangian $L$
which is a scalar under general coordinate transformations and which depends on
the components $g_{\mu\nu}$ of the metric tensor (these are now the dynamical
fields), and their first- and possibly higher-order derivatives. The simplest
non-trivial scalar that can be constructed from the metric and its derivatives
is the Ricci scalar $R$, which depends on $g_{\mu\nu}$ and its first- and
second-order derivatives. In fact, $R$ is the only scalar derivable from the
metric tensor that depends on derivatives no higher than second order. From our
knowledge of gravitation as a manifestation of spacetime curvature, we might
also expect $L$ to be derived from the curvature tensor. Thus, in searching for
the simplest plausible variational principle for gravitation, one is
immediately led to the Einstein-Hilbert action

$$ S_{\rm EH} = \int_{\mathcal{R}} R \sqrt{-g}\, d^4x. \tag{19.32} $$

Since the corresponding Lagrangian $L_{\rm EH} = R$ now depends on the elements
of the metric tensor, it is more convenient to work in terms of the Lagrangian
density $\mathcal{L}_{\rm EH} = R\sqrt{-g}$. Because $R$ contains second
derivatives of the metric, the resulting EL equations take the form (19.14),
which in this case reads

$$ \frac{\partial \mathcal{L}}{\partial g_{\mu\nu}} - \partial_\rho\left[\frac{\partial \mathcal{L}}{\partial(\partial_\rho g_{\mu\nu})}\right] + \partial_\rho \partial_\sigma\left[\frac{\partial \mathcal{L}}{\partial(\partial_\rho \partial_\sigma g_{\mu\nu})}\right] = 0. $$

Unfortunately, the task of evaluating each term in the above equation involves
a formidable amount of algebra, albeit straightforward. We shall therefore not
pursue this approach any further. Instead, we shall derive the corresponding
field equations by considering directly the variation in the action resulting
from a variation in the metric tensor.

Let us therefore consider a variation in the metric tensor given by

$$g_{\mu\nu} \to g_{\mu\nu} + \delta g_{\mu\nu},$$

where $\delta g_{\mu\nu}$ and its first derivative vanish on the boundary $\partial X$ of the region $X$. It
will prove useful also to determine the corresponding variation $\delta g^{\mu\nu}$ in the inverse
metric components. This is most easily achieved by noting that $g^{\mu\rho} g_{\rho\nu} = \delta^\mu_\nu$ and
using the fact that the constant tensor $\delta^\mu_\nu$ does not change under a variation. To
first order in the variation, one may therefore write

$$\delta g^{\mu\rho}\, g_{\rho\nu} + g^{\mu\rho}\, \delta g_{\rho\nu} = 0. \tag{19.33}$$

Multiplying through by $g^{\nu\sigma}$, relabelling indices and rearranging, one obtains

$$\delta g^{\mu\nu} = -g^{\mu\rho} g^{\nu\sigma}\, \delta g_{\rho\sigma}.$$
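The first-order relation between the variation of the inverse metric and that of the metric can be checked numerically. The following is a minimal sketch, not part of the original derivation: the metric `g`, the perturbation pattern `h` and the step `eps` are arbitrary illustrative values, and `mat_inverse` is a hypothetical helper written only for this check.

```python
# Numerical check of  delta g^{mu nu} = - g^{mu rho} g^{nu sigma} delta g_{rho sigma}
# (illustrative values only; nothing here comes from the text itself).

def mat_inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Assumed test metric: a symmetric, invertible matrix with signature (+,-,-,-).
g = [[1.0, 0.1, 0.0, 0.2],
     [0.1, -1.0, 0.3, 0.0],
     [0.0, 0.3, -1.2, 0.1],
     [0.2, 0.0, 0.1, -0.9]]

# Assumed symmetric perturbation delta g_{mu nu} = eps * h_{mu nu}.
eps = 1e-6
h = [[1.0, 2.0, -1.0, 0.5],
     [2.0, 0.3, 0.7, -0.4],
     [-1.0, 0.7, 1.5, 0.2],
     [0.5, -0.4, 0.2, -0.8]]
dg = [[eps * h[i][j] for j in range(4)] for i in range(4)]

g_inv = mat_inverse(g)
g_new_inv = mat_inverse([[g[i][j] + dg[i][j] for j in range(4)] for i in range(4)])

# Exact change in the inverse metric versus the first-order formula.
exact = [[g_new_inv[i][j] - g_inv[i][j] for j in range(4)] for i in range(4)]
first_order = [[-sum(g_inv[i][r] * g_inv[j][s] * dg[r][s]
                     for r in range(4) for s in range(4))
                for j in range(4)] for i in range(4)]

max_error = max(abs(exact[i][j] - first_order[i][j])
                for i in range(4) for j in range(4))
max_change = max(abs(first_order[i][j]) for i in range(4) for j in range(4))
print(max_change, max_error)
```

The residual `max_error` should be second order in `eps`, much smaller than the first-order change `max_change`, which is what one expects if the formula is correct to first order.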

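Writing the Ricci scalar as $R = g^{\mu\nu} R_{\mu\nu}$, the first-order variation in the Einstein-Hilbert
action (19.32) can be written as

$$\delta S_{EH} = \int_X \delta g^{\mu\nu} R_{\mu\nu} \sqrt{-g}\,d^4x + \int_X g^{\mu\nu}\, \delta R_{\mu\nu} \sqrt{-g}\,d^4x + \int_X g^{\mu\nu} R_{\mu\nu}\, \delta(\sqrt{-g})\,d^4x \equiv \delta S_1 + \delta S_2 + \delta S_3. \tag{19.34}$$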

To derive the field equations, we need to factor out the variation $\delta g^{\mu\nu}$ in the
second and third integrals. Let us first focus on the second term and write the
variation $\delta R_{\mu\nu}$ in terms of the variation $\delta g_{\mu\nu}$ in the metric tensor. It is in fact
more illuminating, and no more work, to determine the variation $\delta R^\sigma{}_{\mu\nu\rho}$ in the
full curvature tensor, from which the corresponding variation in the Ricci tensor
can be obtained immediately by contraction. The curvature tensor is given by

$$R^\sigma{}_{\mu\nu\rho} = \partial_\nu \Gamma^\sigma{}_{\mu\rho} - \partial_\rho \Gamma^\sigma{}_{\mu\nu} + \Gamma^\tau{}_{\mu\rho} \Gamma^\sigma{}_{\tau\nu} - \Gamma^\tau{}_{\mu\nu} \Gamma^\sigma{}_{\tau\rho}.$$

Let us first consider the variation in the curvature tensor resulting from an arbitrary
variation in the connection coefficients,

$$\Gamma^\sigma{}_{\mu\nu} \to \Gamma^\sigma{}_{\mu\nu} + \delta\Gamma^\sigma{}_{\mu\nu}.$$

It is worth noting that the variation $\delta\Gamma^\sigma{}_{\mu\nu}$ is the difference of two connections
and is therefore a tensor. As is often the case in proving tensor identities, it is
easiest to work in local geodesic coordinates at some arbitrary point $P$. In such a
coordinate system $\Gamma^\sigma{}_{\mu\nu}(P) = 0$, and so at the point $P$ we have

$$\delta R^\sigma{}_{\mu\nu\rho} = \partial_\nu(\delta\Gamma^\sigma{}_{\mu\rho}) - \partial_\rho(\delta\Gamma^\sigma{}_{\mu\nu}).$$

Moreover, partial derivatives and covariant derivatives coincide at $P$ and so

$$\delta R^\sigma{}_{\mu\nu\rho} = \nabla_\nu(\delta\Gamma^\sigma{}_{\mu\rho}) - \nabla_\rho(\delta\Gamma^\sigma{}_{\mu\nu}). \tag{19.35}$$

We now see, however, that the quantities on the right-hand side are tensors, and
therefore (19.35) holds not only in geodesic coordinates at $P$ but in any arbitrary
coordinate system. Since the point $P$ was chosen arbitrarily, the result (19.35)
thus holds generally and is known as the Palatini equation. The corresponding
variation in the Ricci tensor is obtained by contracting on $\sigma$ and $\rho$ in (19.35)
to give

$$\delta R_{\mu\nu} = \nabla_\nu(\delta\Gamma^\sigma{}_{\mu\sigma}) - \nabla_\sigma(\delta\Gamma^\sigma{}_{\mu\nu}). \tag{19.36}$$

We may therefore write the second term on the right-hand side of (19.34) as

$$\delta S_2 = \int_X g^{\mu\nu}\left[\nabla_\nu(\delta\Gamma^\sigma{}_{\mu\sigma}) - \nabla_\sigma(\delta\Gamma^\sigma{}_{\mu\nu})\right]\sqrt{-g}\,d^4x = \int_X \nabla_\nu\left(g^{\mu\nu}\,\delta\Gamma^\sigma{}_{\mu\sigma} - g^{\mu\sigma}\,\delta\Gamma^\nu{}_{\mu\sigma}\right)\sqrt{-g}\,d^4x,$$

where in the last line we have used the fact that the covariant derivative of the
metric vanishes and we have relabelled indices in the second term of the integrand.
Using the divergence theorem (19.19), however, we may write $\delta S_2$ as a surface
integral over the boundary $\partial X$, which vanishes provided that the variation in the
connection vanishes on the boundary. This requires that the variations in the metric
tensor and in its first derivatives vanish on $\partial X$.

Let us now turn our attention to the third term $\delta S_3$ in (19.34), in which we
must express $\delta\sqrt{-g}$ in terms of the variation $\delta g^{\mu\nu}$. Recalling that $g = \det[g_{\mu\nu}]$,
we note that the cofactor of the element $g_{\mu\nu}$ in this determinant is $g g^{\mu\nu}$. It follows
that

$$\delta g = g g^{\mu\nu}\, \delta g_{\mu\nu} = -g g_{\mu\nu}\, \delta g^{\mu\nu},$$

where in the second equality we have used the result (19.33). Thus, we have

$$\delta\sqrt{-g} = -\tfrac{1}{2}(-g)^{-1/2}\, \delta g = -\tfrac{1}{2}\sqrt{-g}\, g_{\mu\nu}\, \delta g^{\mu\nu}. \tag{19.37}$$
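The variation of $\sqrt{-g}$ can likewise be checked with a finite perturbation. The sketch below is illustrative only: it assumes a diagonal background metric with made-up entries (so that $g^{\mu\nu}$ is simply the reciprocal of each diagonal element), an arbitrary symmetric perturbation `h`, and a hypothetical helper `det`. It compares the exact change in $\sqrt{-g}$ with $\tfrac{1}{2}\sqrt{-g}\,g^{\mu\nu}\delta g_{\mu\nu}$, which is (19.37) rewritten in terms of $\delta g_{\mu\nu}$ by means of (19.33).

```python
import math

def det(m):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Assumed diagonal background metric with signature (+,-,-,-); values are arbitrary.
diag = [0.5, -2.0, -4.0, -3.0]
g = [[diag[i] if i == j else 0.0 for j in range(4)] for i in range(4)]

# Assumed symmetric perturbation delta g_{mu nu} = eps * h_{mu nu}.
eps = 1e-6
h = [[1.0, 2.0, -1.0, 0.5],
     [2.0, 0.3, 0.7, -0.4],
     [-1.0, 0.7, 1.5, 0.2],
     [0.5, -0.4, 0.2, -0.8]]
g_new = [[g[i][j] + eps * h[i][j] for j in range(4)] for i in range(4)]

sqrt_old = math.sqrt(-det(g))
sqrt_new = math.sqrt(-det(g_new))
exact = sqrt_new - sqrt_old

# For a diagonal metric, g^{mu nu} delta g_{mu nu} = sum_mu delta g_{mu mu} / g_{mu mu}.
trace = sum(eps * h[i][i] / diag[i] for i in range(4))
first_order = 0.5 * sqrt_old * trace

rel_error = abs(exact - first_order) / abs(first_order)
print(exact, first_order, rel_error)
```

The relative discrepancy should be of order `eps`, as expected for a formula that is exact only to first order in the variation.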

Substituting this expression into the third term $\delta S_3$ in (19.34) and remembering
that $\delta S_2 = 0$, we finally discover that the variation in the Einstein-Hilbert action
may be written as

$$\delta S_{EH} = \int_X \left(R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R\right) \delta g^{\mu\nu} \sqrt{-g}\,d^4x. \tag{19.38}$$

By demanding that $\delta S_{EH} = 0$ and using the fact that the variation $\delta g^{\mu\nu}$ is arbitrary,
we thus recover Einstein’s field equations in vacuo:

$$R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R = 0. \tag{19.39}$$

This is an impressive result, since we have obtained the field equations of general
relativity by varying an action (19.32) to which we were led very naturally
on the grounds of symmetry and simplicity. This illustrates the power of the
variational approach and should be contrasted with the more heuristic approach
we had to employ in Section 8.4. Moreover, if one were willing to consider more
complicated actions, the variational formalism suggests how Einstein’s theory
might be modified by adding to the Lagrangian terms proportional to $R^2$, $R^3$, etc.
The formalism also provides a means for investigating alternative gravitational
Lagrangians. For example, the choice $L = R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma}$ leads to an alternative
self-consistent theory of gravity considered by Eddington.


19.9 An equivalent action for general relativity in vacuo 

The Einstein-Hilbert action (19.32) differs from the action (19.25) for scalar field 
theory and the action (19.29) for electromagnetism in that it depends on second- 
order derivatives of the dynamical fields. It is therefore of interest to consider 
whether the empty-space gravitational field equations can be derived from an 
action that depends only on the metric tensor and its first derivatives. As stated 
in the previous section, however, R is the only scalar derivable from the metric 
tensor that depends on derivatives no higher than second order, so at first our goal 
appears unattainable. Nevertheless, as we will show, we may use the notion of
equivalent actions discussed in Section 19.5 to circumvent this difficulty, albeit 
in a way that results in a new action that is not a scalar under general coordinate 
transformations. 

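The Lagrangian density $\mathcal{L}_{EH} = \sqrt{-g}\,R$ in the Einstein-Hilbert action (19.32)
may be written as

$$\mathcal{L}_{EH} = \sqrt{-g}\, g^{\mu\nu} R_{\mu\nu} = \sqrt{-g}\, g^{\mu\nu}\left(\partial_\nu \Gamma^\sigma{}_{\mu\sigma} - \partial_\sigma \Gamma^\sigma{}_{\mu\nu} + \Gamma^\rho{}_{\mu\sigma} \Gamma^\sigma{}_{\rho\nu} - \Gamma^\rho{}_{\mu\nu} \Gamma^\sigma{}_{\rho\sigma}\right) = \sqrt{-g}\, g^{\mu\nu}\left(\partial_\nu \Gamma^\sigma{}_{\mu\sigma} - \partial_\sigma \Gamma^\sigma{}_{\mu\nu}\right) - \bar{\mathcal{L}}, \tag{19.40}$$

where in the last line we have defined a new Lagrangian density

$$\bar{\mathcal{L}} \equiv \sqrt{-g}\, g^{\mu\nu}\left(\Gamma^\rho{}_{\mu\nu} \Gamma^\sigma{}_{\rho\sigma} - \Gamma^\rho{}_{\mu\sigma} \Gamma^\sigma{}_{\rho\nu}\right), \tag{19.41}$$

which clearly depends only on the metric and its first derivatives. (Note that the
minus sign in (19.40) is for later convenience.) By relabelling indices and using
Leibnitz’ rule for the differentiation of products, we can write the first term in
(19.40) as

$$\sqrt{-g}\, g^{\mu\nu}\left(\partial_\nu \Gamma^\sigma{}_{\mu\sigma} - \partial_\sigma \Gamma^\sigma{}_{\mu\nu}\right) = \partial_\nu\left(\sqrt{-g}\, g^{\mu\nu} \Gamma^\sigma{}_{\mu\sigma} - \sqrt{-g}\, g^{\mu\sigma} \Gamma^\nu{}_{\mu\sigma}\right) - \partial_\nu(\sqrt{-g}\, g^{\mu\nu})\, \Gamma^\sigma{}_{\mu\sigma} + \partial_\sigma(\sqrt{-g}\, g^{\mu\nu})\, \Gamma^\sigma{}_{\mu\nu}. \tag{19.42}$$

To evaluate the last two terms on the right-hand side, we note that

$$\partial_\sigma(\sqrt{-g}\, g^{\mu\nu}) = -\tfrac{1}{2}(-g)^{-1/2} (\partial_\sigma g)\, g^{\mu\nu} + \sqrt{-g}\, \partial_\sigma g^{\mu\nu}. \tag{19.43}$$

Using the result (3.24) derived in Section 3.10, we have $\partial_\sigma g = 2g\,\Gamma^\rho{}_{\rho\sigma}$ and, since
the covariant derivative of the metric (or its inverse) is zero,

$$\nabla_\sigma g^{\mu\nu} = \partial_\sigma g^{\mu\nu} + \Gamma^\mu{}_{\rho\sigma} g^{\rho\nu} + \Gamma^\nu{}_{\rho\sigma} g^{\mu\rho} = 0.$$

Thus, we may write (19.43) as

$$\partial_\sigma(\sqrt{-g}\, g^{\mu\nu}) = \sqrt{-g}\left(\Gamma^\rho{}_{\rho\sigma} g^{\mu\nu} - \Gamma^\mu{}_{\rho\sigma} g^{\rho\nu} - \Gamma^\nu{}_{\rho\sigma} g^{\mu\rho}\right).$$

Substituting this result into the last two terms on the right-hand side of (19.42)
(contracting on $\nu$ and $\sigma$ for the first of these terms), relabelling indices and
simplifying, one finds that

$$\sqrt{-g}\, g^{\mu\nu}\left(\partial_\nu \Gamma^\sigma{}_{\mu\sigma} - \partial_\sigma \Gamma^\sigma{}_{\mu\nu}\right) = \partial_\nu\left(\sqrt{-g}\, g^{\mu\nu} \Gamma^\sigma{}_{\mu\sigma} - \sqrt{-g}\, g^{\mu\sigma} \Gamma^\nu{}_{\mu\sigma}\right) + 2\bar{\mathcal{L}}.$$

Thus, we finally discover that the Einstein-Hilbert Lagrangian density (19.40)
can be written

$$\mathcal{L}_{EH} = \bar{\mathcal{L}} + \partial_\nu\left(\sqrt{-g}\, g^{\mu\nu} \Gamma^\sigma{}_{\mu\sigma} - \sqrt{-g}\, g^{\mu\sigma} \Gamma^\nu{}_{\mu\sigma}\right), \tag{19.44}$$

where $\bar{\mathcal{L}}$ is given by (19.41).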

We see immediately, however, that the second term in (19.44) is a total derivative,
and so $\mathcal{L}_{EH}$ and $\bar{\mathcal{L}}$ are related by an expression of the form (19.22). The
two Lagrangian densities are therefore equivalent. As discussed in Section 19.5,
variation of the new action

$$\bar{S} = \int_X \bar{\mathcal{L}}\,d^4x \tag{19.45}$$

will thus lead to the same field equations as did the Einstein-Hilbert action $S_{EH}$,
provided that the variations in the metric and its first derivative vanish on the
boundary $\partial X$. Thus, the variation of (19.45) will again yield Einstein’s field
equations in vacuo (which may be checked directly), but the action depends only
on the metric and its first derivatives. There is, however, a price to pay in adopting
the above result, since the new action $\bar{S}$ is easily shown not to be a scalar with
respect to general coordinate transformations (see the discussion in Section 19.5).




19.10 The Palatini approach for general relativity in vacuo 

A more elegant and illuminating method for obtaining the Einstein field equations
from an action depending only on dynamical fields and their first derivatives is
provided by the Palatini approach, which we now discuss. In this formalism one
treats the metric $g_{\mu\nu}$ and the connection $\Gamma^\sigma{}_{\mu\nu}$ as independent fields. In other
words, one does not assume any explicit relationship between the metric and the
connection.

We begin again with the Einstein-Hilbert Lagrangian density

$$\mathcal{L}_{EH} = \sqrt{-g}\, g^{\mu\nu} R_{\mu\nu} = \sqrt{-g}\, g^{\mu\nu}\left(\partial_\nu \Gamma^\sigma{}_{\mu\sigma} - \partial_\sigma \Gamma^\sigma{}_{\mu\nu} + \Gamma^\rho{}_{\mu\sigma} \Gamma^\sigma{}_{\rho\nu} - \Gamma^\rho{}_{\mu\nu} \Gamma^\sigma{}_{\rho\sigma}\right),$$

which we now consider as a function of the metric, the connection and first
derivatives of the connection, i.e. $\mathcal{L}_{EH} = \mathcal{L}_{EH}(g^{\mu\nu}, \Gamma^\sigma{}_{\mu\nu}, \partial_\rho \Gamma^\sigma{}_{\mu\nu})$. Let us first
consider the variation in the action resulting from a variation in the metric alone.
This may be written as

$$\delta S_{EH} = \int_X \delta(\sqrt{-g}\, g^{\mu\nu})\, R_{\mu\nu}\,d^4x.$$

Demanding that $\delta S_{EH} = 0$ for an arbitrary variation in the metric, and using (19.37)
for $\delta\sqrt{-g}$, we immediately find that

$$R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R = 0,$$

which gives the Einstein field equations in vacuo.

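Let us now consider varying the action with respect to the connection, which
yields

$$\delta S_{EH} = \int_X g^{\mu\nu}\, \delta R_{\mu\nu} \sqrt{-g}\,d^4x = \int_X g^{\mu\nu}\left[\nabla_\nu(\delta\Gamma^\sigma{}_{\mu\sigma}) - \nabla_\sigma(\delta\Gamma^\sigma{}_{\mu\nu})\right]\sqrt{-g}\,d^4x, \tag{19.46}$$

where in the second line we have used the contracted version (19.36) of the
Palatini equation. Using Leibnitz’ theorem for the differentiation of products and
relabelling some dummy indices, we may write (19.46) as

$$\delta S_{EH} = \int_X \nabla_\nu\left(g^{\mu\nu}\, \delta\Gamma^\sigma{}_{\mu\sigma} - g^{\mu\sigma}\, \delta\Gamma^\nu{}_{\mu\sigma}\right)\sqrt{-g}\,d^4x - \int_X \left[(\nabla_\nu g^{\mu\nu})\, \delta\Gamma^\sigma{}_{\mu\sigma} - (\nabla_\sigma g^{\mu\nu})\, \delta\Gamma^\sigma{}_{\mu\nu}\right]\sqrt{-g}\,d^4x, \tag{19.47}$$

where we note that we have not assumed that the covariant derivative of the
metric vanishes, since we have not (yet) specified any relationship between the
connection and the metric. Using the divergence theorem (19.19), we may write
the first integral on the right-hand side of (19.47) as a surface integral over the
boundary $\partial X$, which vanishes if we assume that the variation in the connection
vanishes on the boundary. Relabelling some dummy indices in the second integral
on the right-hand side of (19.47), we thus find

$$\delta S_{EH} = \int_X \left(\nabla_\sigma g^{\mu\nu} - \delta^\nu_\sigma \nabla_\rho g^{\mu\rho}\right) \delta\Gamma^\sigma{}_{\mu\nu} \sqrt{-g}\,d^4x. \tag{19.48}$$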

Since we are assuming that the manifold is torsionless, the variation $\delta\Gamma^\sigma{}_{\mu\nu}$ in the
connection, although arbitrary, must be symmetric in its lower two indices. As a
result, demanding that $\delta S_{EH} = 0$ only requires the symmetric part of the term in
parentheses in (19.48) to vanish; when contracted with $\delta\Gamma^\sigma{}_{\mu\nu}$ the antisymmetric
part will automatically equal zero. Thus, stationarity of the action requires that

$$\tfrac{1}{2}\delta^\nu_\sigma \nabla_\rho g^{\mu\rho} + \tfrac{1}{2}\delta^\mu_\sigma \nabla_\rho g^{\nu\rho} - \nabla_\sigma g^{\mu\nu} = 0.$$

We thus deduce that $\nabla_\sigma g^{\mu\nu} = 0$, which in turn implies that $\nabla_\sigma g_{\mu\nu} = 0$. Hence by
demanding stationarity of the Einstein-Hilbert action with respect to variations in
the (symmetric) connection, we have derived that the covariant derivative of the
metric must vanish. We may thus write

$$\partial_\rho g_{\mu\nu} = \Gamma^\sigma{}_{\mu\rho}\, g_{\sigma\nu} + \Gamma^\sigma{}_{\nu\rho}\, g_{\mu\sigma}.$$

Cyclically permuting the free indices to obtain similar expressions for $\partial_\nu g_{\rho\mu}$ and
$\partial_\mu g_{\nu\rho}$, combining the results and contracting with $g^{\rho\sigma}$, one finds that

$$\Gamma^\sigma{}_{\mu\nu} = \tfrac{1}{2} g^{\sigma\rho}\left(\partial_\nu g_{\rho\mu} + \partial_\mu g_{\rho\nu} - \partial_\rho g_{\mu\nu}\right),$$

and hence the connection must be the metric connection.
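The two short steps in this argument can be spelled out as follows. This is a sketch under the assumptions already made in the text (a four-dimensional spacetime, so that $\delta^\nu_\nu = 4$, and a connection symmetric in its lower indices); the intermediate lines are added here for illustration.

```latex
% Step 1: contract the stationarity condition on \sigma and \nu.
\begin{align*}
\tfrac{1}{2}\delta^\nu_\nu \nabla_\rho g^{\mu\rho}
  + \tfrac{1}{2}\delta^\mu_\nu \nabla_\rho g^{\nu\rho}
  - \nabla_\nu g^{\mu\nu}
  &= \left(2 + \tfrac{1}{2} - 1\right) \nabla_\rho g^{\mu\rho}
   = \tfrac{3}{2}\, \nabla_\rho g^{\mu\rho} = 0 ,
\end{align*}
% so \nabla_\rho g^{\mu\rho} = 0; substituting back, the two trace terms drop out
% and the condition reduces to \nabla_\sigma g^{\mu\nu} = 0.

% Step 2: write \nabla_\rho g_{\mu\nu} = 0 out explicitly for the three cyclic
% orderings of the indices and combine them.
\begin{align*}
\partial_\nu g_{\rho\mu} &= \Gamma^\sigma{}_{\rho\nu}\, g_{\sigma\mu} + \Gamma^\sigma{}_{\mu\nu}\, g_{\rho\sigma}, \\
\partial_\mu g_{\nu\rho} &= \Gamma^\sigma{}_{\nu\mu}\, g_{\sigma\rho} + \Gamma^\sigma{}_{\rho\mu}\, g_{\nu\sigma}, \\
\partial_\rho g_{\mu\nu} &= \Gamma^\sigma{}_{\mu\rho}\, g_{\sigma\nu} + \Gamma^\sigma{}_{\nu\rho}\, g_{\mu\sigma}, \\
\partial_\nu g_{\rho\mu} + \partial_\mu g_{\nu\rho} - \partial_\rho g_{\mu\nu}
  &= 2\, \Gamma^\sigma{}_{\mu\nu}\, g_{\sigma\rho} .
\end{align*}
% Contracting the last line with \tfrac{1}{2} g^{\rho\tau} isolates \Gamma^\tau{}_{\mu\nu}.
```

In the last line the four terms that do not contain $\Gamma^\sigma{}_{\mu\nu}$ cancel in pairs precisely because the connection is symmetric in its lower indices, which is where the torsionless assumption enters.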


19.11 General relativity in the presence of matter 

So far we have confined our attention to deriving the gravitational field equations
in vacuo. We now consider how the full Einstein equations, in the presence of
other (non-gravitational) fields, may be obtained by a variational principle. In
order to accommodate this generalisation, one simply needs to add an extra term
to the action to give

$$S = \frac{1}{2\kappa} S_{EH} + S_M = \int_X \left(\frac{1}{2\kappa} \mathcal{L}_{EH} + \mathcal{L}_M\right) d^4x, \tag{19.49}$$

where the Einstein-Hilbert action $S_{EH}$ is considered as a function of the metric
and of its first- and second-order derivatives (as in Section 19.8), $S_M$ is the
‘matter’ action for any non-gravitational fields present, and $\kappa = 8\pi G/c^4$. The
factor $1/(2\kappa)$ in (19.49) is chosen for later convenience.

Let us now consider varying the action with respect to the (inverse) metric, to
obtain

$$\frac{1}{2\kappa} \frac{\delta\mathcal{L}_{EH}}{\delta g^{\mu\nu}} + \frac{\delta\mathcal{L}_M}{\delta g^{\mu\nu}} = 0.$$

From (19.38), we see that

$$\frac{\delta\mathcal{L}_{EH}}{\delta g^{\mu\nu}} = \sqrt{-g}\, G_{\mu\nu},$$

where $G_{\mu\nu} \equiv R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R$ is the Einstein tensor. Thus, if we make the bold
assertion that the energy-momentum tensor of the non-gravitational fields (or
‘matter’) is given by

$$T_{\mu\nu} = \frac{2}{\sqrt{-g}} \frac{\delta\mathcal{L}_M}{\delta g^{\mu\nu}}, \tag{19.50}$$

then we recover the full Einstein equations

$$G_{\mu\nu} = -\kappa T_{\mu\nu}.$$

The definition (19.50) of the ‘matter’ energy-momentum tensor may appear to
be somewhat arbitrary. Nevertheless, as we show in the next section, this tensor
has all the properties required of an energy-momentum tensor.


19.12 The dynamical energy-momentum tensor 

The quantities $T_{\mu\nu}$ defined in (19.50) are clearly the components of a tensor, which
is known more properly as the dynamical energy-momentum tensor. From the
definition we also see immediately that $T_{\mu\nu}$ is a symmetric tensor, as is required
by the full Einstein equations (19.39). Most importantly, however, we now show
that it obeys the conservation equation $\nabla_\mu T^{\mu\nu} = 0$.

From the definition (19.50), the variation in the matter action resulting from a
variation in the metric is given by

$$\delta S_M = \int_X \frac{\delta\mathcal{L}_M}{\delta g^{\mu\nu}}\, \delta g^{\mu\nu}\,d^4x = \tfrac{1}{2} \int_X T_{\mu\nu}\, \delta g^{\mu\nu} \sqrt{-g}\,d^4x = -\tfrac{1}{2} \int_X T^{\mu\nu}\, \delta g_{\mu\nu} \sqrt{-g}\,d^4x, \tag{19.51}$$

where, in the last equality, we have written $\delta S_M$ in terms of the contravariant
components $T^{\mu\nu}$ of the energy-momentum tensor for later convenience, using the
result (19.33). Let us now consider making an infinitesimal general coordinate
transformation

$$x'^\mu = x^\mu + \xi^\mu(x), \tag{19.52}$$

where $\xi^\mu(x)$ is an infinitesimal smooth vector field. Since the action $S_M$ is, by
construction, a covariant scalar, we must have $\delta S_M = 0$ under the coordinate
transformation. We know, however, that the metric coefficients must transform as

$$g'_{\mu\nu}(x') = \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu}\, g_{\rho\sigma}(x) = \left[\delta^\rho_\mu - \partial_\mu \xi^\rho(x)\right]\left[\delta^\sigma_\nu - \partial_\nu \xi^\sigma(x)\right] g_{\rho\sigma}(x) \tag{19.53}$$

$$\phantom{g'_{\mu\nu}(x')} = g_{\mu\nu}(x) - g_{\rho\nu}(x)\, \partial_\mu \xi^\rho(x) - g_{\mu\sigma}(x)\, \partial_\nu \xi^\sigma(x), \tag{19.54}$$

to first order in $\xi^\mu$, where we have used the expression (17.3) for the transformation
matrix corresponding to the infinitesimal coordinate transformation
(19.52). We have explicitly included the dependence on $x$ and $x'$ in (19.54), since
it is crucial to determining the corresponding variation $\delta g_{\mu\nu}$. As mentioned in
Section 19.3, this variation is only of the functional form of the fields $g_{\mu\nu}$. Thus,
we have

$$\delta g_{\mu\nu}(x) \equiv g'_{\mu\nu}(x) - g_{\mu\nu}(x) = \left[g'_{\mu\nu}(x') - g_{\mu\nu}(x)\right] - \left[g'_{\mu\nu}(x') - g'_{\mu\nu}(x)\right]$$

$$\phantom{\delta g_{\mu\nu}(x)} = \left[g'_{\mu\nu}(x') - g_{\mu\nu}(x)\right] - \xi^\sigma(x)\, \partial_\sigma g'_{\mu\nu}(x)$$

$$\phantom{\delta g_{\mu\nu}(x)} = \left[g'_{\mu\nu}(x') - g_{\mu\nu}(x)\right] - \xi^\sigma(x)\, \partial_\sigma g_{\mu\nu}(x),$$

to first order in $\xi^\mu$. Using the expression (19.54) and dropping the explicit
dependence on $x$, we find that

$$\delta g_{\mu\nu} = -g_{\rho\nu}\, \partial_\mu \xi^\rho - g_{\mu\rho}\, \partial_\nu \xi^\rho - \xi^\rho\, \partial_\rho g_{\mu\nu} = -(\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu),$$

where, in the second equality, we have rewritten the partial derivatives in terms of
covariant derivatives, cancelled matching terms involving connection coefficients
and used the fact that $\nabla_\rho g_{\mu\nu} = 0$.
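The cancellation of the connection terms mentioned above can be made explicit. The following lines are a sketch of that intermediate algebra, assuming the metric connection and $\xi_\nu = g_{\nu\rho}\xi^\rho$; they are added for illustration and use no input beyond the definitions already given.

```latex
\begin{align*}
\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu
  &= \partial_\mu \xi_\nu + \partial_\nu \xi_\mu - 2\, \Gamma^\rho{}_{\mu\nu}\, \xi_\rho , \\
\partial_\mu \xi_\nu + \partial_\nu \xi_\mu
  &= g_{\nu\rho}\, \partial_\mu \xi^\rho + g_{\mu\rho}\, \partial_\nu \xi^\rho
     + \xi^\rho \left( \partial_\mu g_{\nu\rho} + \partial_\nu g_{\mu\rho} \right) , \\
2\, \Gamma^\rho{}_{\mu\nu}\, \xi_\rho
  &= \xi^\rho \left( \partial_\mu g_{\rho\nu} + \partial_\nu g_{\rho\mu} - \partial_\rho g_{\mu\nu} \right) , \\
\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu
  &= g_{\rho\nu}\, \partial_\mu \xi^\rho + g_{\mu\rho}\, \partial_\nu \xi^\rho + \xi^\rho\, \partial_\rho g_{\mu\nu} .
\end{align*}
```

The final line is minus the expression obtained for $\delta g_{\mu\nu}$ from (19.54), which reproduces $\delta g_{\mu\nu} = -(\nabla_\mu\xi_\nu + \nabla_\nu\xi_\mu)$.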

Substituting this result into (19.51) and remembering that $\delta S_M = 0$ under a
coordinate transformation and that $T^{\mu\nu}$ is symmetric, we have

$$\delta S_M = \int_X T^{\mu\nu} (\nabla_\mu \xi_\nu) \sqrt{-g}\,d^4x = 0.$$

Using Leibnitz’ theorem for the covariant differentiation of a product, we write

$$\delta S_M = \int_X \nabla_\mu(T^{\mu\nu} \xi_\nu) \sqrt{-g}\,d^4x - \int_X (\nabla_\mu T^{\mu\nu})\, \xi_\nu \sqrt{-g}\,d^4x = 0. \tag{19.55}$$

We may use the divergence theorem (19.19) to write the first integral as a surface
integral over the boundary $\partial X$ in the usual manner. Assuming that the functions
$\xi^\mu(x)$ vanish on the boundary $\partial X$, this surface integral vanishes, leaving only the
second integral in (19.55). Since the $\xi^\mu(x)$ are arbitrary, however, one immediately
finds that

$$\nabla_\mu T^{\mu\nu} = 0,$$

and so the covariant divergence of the energy-momentum tensor vanishes, as
required. Thus, we see that the general covariance of the matter action implies
energy-momentum conservation in the same way as the gauge invariance
of the action (19.29) for electromagnetism implies charge conservation (see
Section 19.7).

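Now that we have shown that the tensor $T_{\mu\nu}$ defined by (19.50) has the
appropriate properties of an energy-momentum tensor, we may calculate the
explicit form of this tensor for some specific ‘matter’ actions. Let us begin by
considering the action (19.25) for a real scalar field $\phi$. Varying this action now
with respect to the (inverse) metric, rather than the field $\phi$, we obtain

$$\delta S_\phi = \int_X \left\{ \tfrac{1}{2}\, \delta g^{\mu\nu} (\nabla_\mu \phi)(\nabla_\nu \phi) \sqrt{-g} + \left[ \tfrac{1}{2} g^{\mu\nu} (\nabla_\mu \phi)(\nabla_\nu \phi) - V(\phi) \right] \delta(\sqrt{-g}) \right\} d^4x$$

$$\phantom{\delta S_\phi} = \tfrac{1}{2} \int_X \left\{ (\nabla_\mu \phi)(\nabla_\nu \phi) - g_{\mu\nu} \left[ \tfrac{1}{2} g^{\rho\sigma} (\nabla_\rho \phi)(\nabla_\sigma \phi) - V(\phi) \right] \right\} \delta g^{\mu\nu} \sqrt{-g}\,d^4x,$$

where in the last line we have used the expression (19.37) for $\delta\sqrt{-g}$. Comparing
the above expression with that in (19.51), we immediately see that the energy-momentum
tensor for a real scalar field is given by

$$T^{\phi}_{\mu\nu} = (\nabla_\mu \phi)(\nabla_\nu \phi) - g_{\mu\nu} \left[ \tfrac{1}{2} (\nabla_\sigma \phi)(\nabla^\sigma \phi) - V(\phi) \right],$$

which agrees with the expression (16.7) adopted in our discussion of inflation in
Section 16.1.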
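As a concrete illustration of the scalar-field result, the sketch below evaluates $T_{\mu\nu}$ for a spatially homogeneous field in Minkowski spacetime with signature $(+,-,-,-)$ and units in which $c = 1$. The function name `scalar_field_stress_tensor` and the numerical values of $\dot\phi$ and $V$ are assumptions chosen only for the example. In this simple case one expects an energy density $\tfrac{1}{2}\dot\phi^2 + V$ and an isotropic pressure $\tfrac{1}{2}\dot\phi^2 - V$.

```python
def scalar_field_stress_tensor(dphi, V, g_lower, g_upper):
    """T_{mu nu} = d_mu(phi) d_nu(phi) - g_{mu nu} [ (1/2) g^{rho sigma} d_rho(phi) d_sigma(phi) - V ].

    dphi    : list of the covariant components d_mu(phi) (for a scalar field the
              covariant derivative reduces to the partial derivative)
    V       : value of the potential V(phi) at the point considered
    g_lower : metric components g_{mu nu}
    g_upper : inverse metric components g^{mu nu}
    """
    dim = len(dphi)
    kinetic = 0.5 * sum(g_upper[r][s] * dphi[r] * dphi[s]
                        for r in range(dim) for s in range(dim))
    return [[dphi[m] * dphi[n] - g_lower[m][n] * (kinetic - V)
             for n in range(dim)] for m in range(dim)]

# Minkowski metric, signature (+,-,-,-); it is its own inverse.
eta = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

# Assumed example values: homogeneous field, so only the time derivative is non-zero.
phi_dot = 0.6
V = 0.1
T = scalar_field_stress_tensor([phi_dot, 0.0, 0.0, 0.0], V, eta, eta)

energy_density = T[0][0]   # expected: phi_dot**2 / 2 + V
pressure = T[1][1]         # expected: phi_dot**2 / 2 - V
print(energy_density, pressure)
```

When the kinetic term is negligible compared with the potential, the same expressions give a pressure close to minus the energy density.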

We may also obtain the energy-momentum tensor for the electromagnetic field
in a similar manner. From (19.29) and (19.28), in the absence of sources we may
write the action for electromagnetism as

$$S_{EM} = -\frac{1}{4\mu_0} \int_X g^{\mu\rho} g^{\nu\sigma} F_{\rho\sigma} F_{\mu\nu} \sqrt{-g}\,d^4x,$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and so does not depend on the metric. Varying this
action with respect to the (inverse) metric, we have

$$\delta S_{EM} = -\frac{1}{4\mu_0} \int_X \left[ \delta(g^{\mu\rho} g^{\nu\sigma})\, F_{\rho\sigma} F_{\mu\nu} \sqrt{-g} + F_{\rho\sigma} F^{\rho\sigma}\, \delta(\sqrt{-g}) \right] d^4x$$

$$\phantom{\delta S_{EM}} = -\frac{1}{4\mu_0} \int_X \left( 2 g^{\rho\sigma} F_{\mu\rho} F_{\nu\sigma} - \tfrac{1}{2} g_{\mu\nu} F_{\rho\sigma} F^{\rho\sigma} \right) \delta g^{\mu\nu} \sqrt{-g}\,d^4x,$$

where in the second equality we have substituted the expression (19.37) for
$\delta\sqrt{-g}$ and relabelled some dummy indices. Comparing the above expression with
(19.51), we find that the energy-momentum tensor for the electromagnetic field
is given by

$$T^{EM}_{\mu\nu} = -\mu_0^{-1} \left( F_{\mu\rho} F_\nu{}^\rho - \tfrac{1}{4} g_{\mu\nu} F_{\rho\sigma} F^{\rho\sigma} \right),$$

which agrees with the expression derived in Exercise 8.3.
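The electromagnetic expression can be exercised in the same way. The sketch below works in Minkowski spacetime with signature $(+,-,-,-)$ and, for simplicity, units with $\mu_0 = c = 1$. The assignment of the electric and magnetic fields to the components of $F_{\mu\nu}$ ($F_{0i} = E_i/c$, $F_{ij} = -\epsilon_{ijk}B_k$) is an assumed convention and may differ by signs from the one used elsewhere; the two quantities checked here, $T_{00}$ and the trace, are quadratic in $F_{\mu\nu}$ and do not depend on those signs. The function name `em_stress_tensor` and the field values are illustrative.

```python
def em_stress_tensor(E, B, mu0=1.0, c=1.0):
    """T_{mu nu} = -(1/mu0) (F_{mu rho} F_nu^rho - (1/4) g_{mu nu} F_{rho sigma} F^{rho sigma})
    in Minkowski spacetime with metric diag(+1,-1,-1,-1).

    Assumed convention (not taken from the text): F_{0i} = E_i / c, F_{ij} = -eps_{ijk} B_k.
    """
    eta = [1.0, -1.0, -1.0, -1.0]          # diagonal of g_{mu nu} (equal to that of g^{mu nu})
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i] / c
        F[i + 1][0] = -E[i] / c
    F[1][2], F[2][1] = -B[2], B[2]
    F[2][3], F[3][2] = -B[0], B[0]
    F[3][1], F[1][3] = -B[1], B[1]

    # Invariant F_{rho sigma} F^{rho sigma}; indices raised with the diagonal metric.
    F_sq = sum(eta[m] * eta[n] * F[m][n] ** 2 for m in range(4) for n in range(4))

    T = [[-(sum(F[m][r] * F[n][r] * eta[r] for r in range(4))
            - 0.25 * (eta[m] if m == n else 0.0) * F_sq) / mu0
          for n in range(4)] for m in range(4)]
    return T, eta

# Assumed example: crossed fields, E along x and B along y.
E = [1.0, 0.0, 0.0]
B = [0.0, 2.0, 0.0]
T, eta = em_stress_tensor(E, B)

energy_density = T[0][0]                              # expected: (E^2 + B^2) / 2 in these units
trace = sum(eta[m] * T[m][m] for m in range(4))       # g^{mu nu} T_{mu nu}
print(energy_density, trace)
```

The vanishing trace reflects the fact that the contraction $g^{\mu\nu}g_{\mu\nu} = 4$ exactly cancels the first term in four dimensions.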

Finally, we note that in field theory it is common to define also a canonical
energy-momentum tensor, which is based on Noether’s theorem.$^{4}$ This states that
for every symmetry of the action there exists a corresponding conserved quantity.
In particular, if an action is invariant under a spacetime translation, characterised
by a coordinate transformation of the form $x'^\mu = x^\mu + a^\mu$ in which the vector
$a^\mu$ does not depend on spacetime position, then one can define a tensor $S^{\mu\nu}$ that
obeys $\partial_\mu S^{\mu\nu} = 0$. It is this tensor that is usually called the canonical
energy-momentum tensor. Unfortunately, there are some drawbacks in using it, since it
is not necessarily symmetric (although it can be made so) or gauge invariant.


Exercises 


19.1 If $\rho(x)$ and $\tau(x)$ are the local line density and tension of a string, show that the
kinetic and potential energies of the string for small displacements $\phi(t, x)$ are given
by



19.2 In classical field theory, the conjugate field momenta are defined in terms of the
Lagrangian density $\mathcal{L}$ by

$$\pi_a = \frac{\partial\mathcal{L}}{\partial\dot{\Phi}^a},$$

where $\dot{\Phi}^a \equiv \partial_0 \Phi^a$ and $x^0$ is a timelike coordinate. The Hamiltonian density is then
defined as

$$\mathcal{H} = \pi_a \dot{\Phi}^a - \mathcal{L}.$$

Use the Euler-Lagrange equations to show that

$$\dot{\Phi}^a = \frac{\delta\mathcal{H}}{\delta\pi_a} \quad\text{and}\quad \dot{\pi}_a = -\frac{\delta\mathcal{H}}{\delta\Phi^a}.$$

19.3 Consider the quantity

$$E = \int_S \mathcal{H}\,d^3x,$$

where $\mathcal{H}$ is the Hamiltonian density in Exercise 19.2 and the integral extends over
some spacelike hypersurface $S$ for which $x^0$ = constant. Setting $[x^\mu] = (t, x^i)$ and
using a dot to denote $\partial_t$, show that

$$\frac{dE}{dt} = \int_S \left[ \frac{\partial\mathcal{H}}{\partial\Phi^a} \dot{\Phi}^a + \frac{\partial\mathcal{H}}{\partial\pi_a} \dot{\pi}_a + \frac{\partial\mathcal{H}}{\partial(\partial_i \Phi^a)}\, \partial_i \dot{\Phi}^a + \frac{\partial\mathcal{H}}{\partial t} \right] d^3x.$$

By integrating the third term in the integrand by parts, show that $dE/dt = 0$
provided that $\mathcal{H}$ does not depend explicitly on $t$.

$^{4}$ See, for example, L. H. Ryder, Quantum Field Theory, Cambridge University Press, 1985.

19.4 Obtain an expression for the Hamiltonian density $\mathcal{H}$ for the string in Exercise 19.1.
Hence show that the total energy E of the string is given by 

E — f Ji dx , 

J o 


and show explicitly that it is a constant of the motion. 

19.5 A relative tensor of weight w transforms under a coordinate transformation as 


TV = r 


,dx' a 

dx c 


dx'\ 

dx d 


where J is the Jacobian of the transformation and is given by 


J — det 


~dx' a 

dx b 


Show that the product of two relative tensors of weights w , and w 2 is a relative 
tensor of weight w l + w 2 . Show further that ^/—g is a relative scalar of weight 
w = 1 (called a scalar density). 

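19.6 For a field theory defined by the action $S = \int_X \mathcal{L}\,d^4x$ show that, if $\mathcal{L}$ depends
on first- and second-order derivatives of the fields, the Euler-Lagrange equations
take the form

$$\frac{\delta\mathcal{L}}{\delta\Phi^a} = \frac{\partial\mathcal{L}}{\partial\Phi^a} - \partial_\mu \left[ \frac{\partial\mathcal{L}}{\partial(\partial_\mu \Phi^a)} \right] + \partial_\mu \partial_\nu \left[ \frac{\partial\mathcal{L}}{\partial(\partial_\mu \partial_\nu \Phi^a)} \right] = 0,$$

provided that the variations $\delta\Phi^a$ and their first derivatives vanish on the boundary
$\partial X$. How do the Euler-Lagrange equations generalise when $\mathcal{L}$ depends on higher-order
derivatives of the fields? What assumptions are required regarding the value
of the variation $\delta\Phi^a$ and its derivatives on $\partial X$?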

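19.7 Consider a local field theory for which the action has the form

$$S = \int_X \mathcal{L}[\Phi^a(x), \partial_\mu \Phi^a(x)]\,d^4x.$$

Under an infinitesimal general coordinate transformation $x'^\mu = x^\mu + \xi^\mu(x)$, the
variation in the action is given by

$$\delta S = \int_{X'} \mathcal{L}'[\Phi'^a(x'), \partial'_\mu \Phi'^a(x')]\,d^4x' - \int_X \mathcal{L}[\Phi^a(x), \partial_\mu \Phi^a(x)]\,d^4x.$$

Adopting the shorthand notation $\delta S = \int_{X'} \mathcal{L}'(x')\,d^4x' - \int_X \mathcal{L}(x)\,d^4x$, show that

$$\delta S = \int_X \left[ \Delta\mathcal{L}(x) + \mathcal{L}(x)\, \partial_\mu \xi^\mu(x) \right] d^4x = \int_X \left\{ \delta\mathcal{L}(x) + \partial_\mu [\mathcal{L}(x)\, \xi^\mu(x)] \right\} d^4x,$$

where $\Delta\mathcal{L}(x) = \mathcal{L}'(x') - \mathcal{L}(x)$ and $\delta\mathcal{L}(x) = \mathcal{L}'(x) - \mathcal{L}(x)$.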

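19.8 Suppose that the action in Exercise 19.7 is invariant under the given coordinate
transformation, so that $\delta S = 0$. Since the range of integration $X$ can be chosen
arbitrarily, show by writing

$$\delta\mathcal{L} = \frac{\partial\mathcal{L}}{\partial\Phi^a}\, \delta\Phi^a + \frac{\partial\mathcal{L}}{\partial(\partial_\mu \Phi^a)}\, \delta(\partial_\mu \Phi^a),$$

or otherwise, that

$$\left[ \frac{\partial\mathcal{L}}{\partial\Phi^a} - \partial_\mu \left( \frac{\partial\mathcal{L}}{\partial(\partial_\mu \Phi^a)} \right) \right] \delta\Phi^a + \partial_\mu \left[ \frac{\partial\mathcal{L}}{\partial(\partial_\mu \Phi^a)}\, \delta\Phi^a + \mathcal{L}\, \xi^\mu \right] = 0.$$

Hence show that the invariance of the action under the given coordinate transformation
implies that $\partial_\mu j^\mu = 0$, where

$$j^\mu = \frac{\partial\mathcal{L}}{\partial(\partial_\mu \Phi^a)}\, \Delta\Phi^a - \left[ \frac{\partial\mathcal{L}}{\partial(\partial_\mu \Phi^a)}\, \partial_\nu \Phi^a - \delta^\mu_\nu \mathcal{L} \right] \xi^\nu,$$

in which $\Delta\Phi^a(x) = \Phi'^a(x') - \Phi^a(x)$. This result is known as Noether's theorem.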

19.9 Use your answer to Exercise 19.8 to show that invariance of the action under the
infinitesimal translation $x'^\mu = x^\mu + a^\mu$ implies that $\partial_\mu S^\mu{}_\nu = 0$, where

$$S^\mu{}_\nu = \frac{\partial\mathcal{L}}{\partial(\partial_\mu \Phi^a)}\, \partial_\nu \Phi^a - \delta^\mu_\nu \mathcal{L},$$

which is known as the canonical energy-momentum tensor of the fields $\Phi^a$. Is
$S^{\mu\nu}$ necessarily symmetric in $\mu$ and $\nu$?

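19.10 For the field theory considered in Exercise 19.7, use the fact that $\mathcal{L}$ does not
depend explicitly on the coordinates $x^\mu$ to write

$$\partial_\nu \mathcal{L} = \frac{\partial\mathcal{L}}{\partial\Phi^a}\, \partial_\nu \Phi^a + \frac{\partial\mathcal{L}}{\partial(\partial_\mu \Phi^a)}\, \partial_\nu \partial_\mu \Phi^a.$$

By multiplying the Euler-Lagrange equations by $\partial_\nu \Phi^a$ and summing over $a$, use
the above result to show directly that $\partial_\mu S^\mu{}_\nu = 0$, where $S^\mu{}_\nu$ is the canonical
energy-momentum tensor given in Exercise 19.9.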

19.11 Consider the ‘modified’ energy-momentum tensor

$$\Theta^\mu{}_\nu = S^\mu{}_\nu + \partial_\sigma \psi^{\sigma\mu}{}_\nu,$$

where $S^\mu{}_\nu$ is the canonical energy-momentum tensor given in Exercise 19.9 and
$\psi^{\sigma\mu}{}_\nu$ is any tensor that is antisymmetric in $\sigma$ and $\mu$. Show that $\partial_\mu \Theta^\mu{}_\nu = 0$ and
that one can always arrange for $\Theta^{\mu\nu}$ to be symmetric in $\mu$ and $\nu$.

19.12 Consider a local field theory defined on Minkowski spacetime in an arbitrary
coordinate system with metric $g_{\mu\nu}$. The action has the form

$$S = \int_X L[\Phi^a(x), \nabla_\mu \Phi^a(x)] \sqrt{-g}\,d^4x,$$

where the fields $\Phi^a$ are independent of the metric $g_{\mu\nu}$ and $L$ is a scalar under
general coordinate transformations. Use the fact that $L$ does not depend explicitly
on $x^\mu$ to write

$$\nabla_\nu L = \frac{\partial L}{\partial\Phi^a}\, \nabla_\nu \Phi^a + \frac{\partial L}{\partial(\nabla_\mu \Phi^a)}\, \nabla_\nu \nabla_\mu \Phi^a.$$

By multiplying the appropriate form of the Euler-Lagrange equations by $\nabla_\nu \Phi^a$,
summing over $a$ and noting that covariant derivatives commute in Minkowski
spacetime, use the above result to show that $\nabla_\mu S^{\mu\nu} = 0$, where the covariant
canonical energy-momentum tensor $S^{\mu\nu}$ is given by

$$S^{\mu\nu} = \frac{\partial L}{\partial(\nabla_\mu \Phi^a)}\, \nabla^\nu \Phi^a - g^{\mu\nu} L.$$


19.13 Consider the ‘modified’ energy-momentum tensor

$$\Theta^\mu{}_\nu = S^\mu{}_\nu + \nabla_\sigma \psi^{\sigma\mu}{}_\nu,$$

where $S^\mu{}_\nu$ is the canonical energy-momentum tensor given in Exercise 19.12 and
$\psi^{\sigma\mu}{}_\nu$ is any tensor that is antisymmetric in $\sigma$ and $\mu$. Show that, in a flat spacetime,
$\nabla_\mu \Theta^\mu{}_\nu = 0$ and that one can always arrange for $\Theta^{\mu\nu}$ to be symmetric in $\mu$ and $\nu$.

19.14 In a four-dimensional spacetime, use the divergence theorem to show that 

f d 4 x= [ nrfj=yd?y, 

Jx Jsx 

where $v^\mu$ is an arbitrary vector field, $\gamma$ is the determinant of the induced metric
on the boundary in the coordinates $y^i$ and $n_\mu$ is a unit normal to the boundary.

19.15 Consider a complex scalar field $\phi = (\phi_1 + i\phi_2)/\sqrt{2}$, where $\phi_i$ ($i = 1, 2$) are real
scalar fields with potentials of the form $V = \tfrac{1}{2} m^2 \phi_i^2$. Show that the Lagrangian for
$\phi$ may be written as


where the asterisk denotes the complex conjugate. By varying $\phi$ and $\phi^*$ independently,
show that

$$\Box^2 \phi + m^2 \phi = 0 \quad\text{and}\quad \Box^2 \phi^* + m^2 \phi^* = 0,$$

where $\Box^2 \equiv \nabla_\mu \nabla^\mu = g^{\mu\nu} \nabla_\mu \nabla_\nu$ is the covariant d’Alembertian operator.

19.16 In the theory of electromagnetism in arbitrary coordinates, the field tensor is
defined by $F_{\mu\nu} = \nabla_\mu A_\nu - \nabla_\nu A_\mu$. Show directly that

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$$

and that

$$\nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} = \partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma}.$$

Hence show that $F_{\mu\nu}$ automatically satisfies the relation

$$\nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} = 0.$$

19.17 If F^ v = V fJL A v — V^A^, show that 

KF, V + = 2 (R p VfLa + R p p m , + R p OV)JL )Ap, 

where R p vlLlT is the Riemann tensor. Hence use the cyclic identity (??) to show 
that the above expression is zero. 

19.18 An alternative Lagrangian for electromagnetism is given by

$$L = \frac{1}{4\mu_0} F_{\mu\nu} F^{\mu\nu} - \frac{1}{2\mu_0} F^{\mu\nu} (\nabla_\mu A_\nu - \nabla_\nu A_\mu) - j^\mu A_\mu,$$

where $F_{\mu\nu}$ and $A_\mu$ are considered as independent quantities (i.e. no functional
relationship between them is assumed). By varying the corresponding action with
respect to $F_{\mu\nu}$ and $A_\mu$ independently, show that the Euler-Lagrange equations
yield

$$\nabla_\mu F^{\mu\nu} = \mu_0 j^\nu \quad\text{and}\quad F_{\mu\nu} = \nabla_\mu A_\nu - \nabla_\nu A_\mu.$$

19.19 The Lagrangian for a free massive vector field A 11 of mass m is 

L = -\g^g m {VpA <T -V tr A p ){VpA v -V v Ap)-\m 1 A ll A p . 

Show that the field equation for A M is given by 

VJV'A' 1 - V^A”) + m 2 A v = 0. 

By making use of the fact that covariant derivatives commute in Minkowski 
spacetime, show that in this case V v A v = 0 and hence that the field equation can 
be written 

□ 2 A m + ot 2 A m = 0, 

where D 2 = V M V^ = V, is the covariant d’Alembertian operator. These are 
called the second-order Proca equations. 

19.20 An alternative Lagrangian for a free massive vector field A 11 of mass m, is 

L = \Fp V F^ - \F pv (VpA v -\Ap) - \m 2 ApA p , 

where F^ v and A 11 are considered as independent quantities. By varying the 
corresponding action with respect to F [±l , and A 11 independently, show that the 
Euler-Lagrange equations yield 

\F^ + nr A v = 0 and = V, A v - V„ A M , 

which are called the first-order Proca equations.

19.21 The simplest scalar action for gravity in vacuo that one can construct from the
metric tensor alone is

$$S = \int_X \sqrt{-g}\,d^4x.$$

Show that the corresponding field equations are given by $\sqrt{-g}\, g_{\mu\nu} = 0$ and clearly
do not constitute a viable theory of gravity.

19.22 Under a general infinitesimal coordinate transformation of the form $x'^\mu = x^\mu + \xi^\mu(x)$,
show that

$$\delta g_{\mu\nu} \equiv g'_{\mu\nu}(x) - g_{\mu\nu}(x) = -(\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu).$$

19.23 Consider a general action for gravity in vacuo of the form

$$S = \int_X \mathcal{L}(g_{\mu\nu}, \partial_\sigma g_{\mu\nu}, \partial_\rho \partial_\sigma g_{\mu\nu}, \ldots)\,d^4x.$$

By considering a general infinitesimal coordinate transformation of the form $x'^\mu =
x^\mu + \xi^\mu(x)$, where the $\xi^\mu(x)$ vanish on the boundary $\partial X$, show that the metric and
its derivatives must satisfy the differential constraints



where $\delta\mathcal{L}/\delta g_{\mu\nu}$ is the variational derivative of the Lagrangian density with respect
to the metric. Hence show that for the Einstein-Hilbert action these differential
constraints lead to the contracted Bianchi identities $\nabla_\mu G^{\mu\nu} = 0$.

19.24 Show explicitly that the quadratic action

$$\bar{S} = \int_X g^{\mu\nu} \left( \Gamma^\rho{}_{\mu\nu} \Gamma^\sigma{}_{\rho\sigma} - \Gamma^\rho{}_{\mu\sigma} \Gamma^\sigma{}_{\rho\nu} \right) \sqrt{-g}\,d^4x$$

is not a scalar with respect to general coordinate transformations. Show further
that varying this action with respect to the metric and its first derivative leads to
the Einstein field equations in vacuo, provided that the variation in the metric and
its first derivative vanish on the boundary $\partial X$.

19.25 Obtain an expression for the dynamical energy-momentum tensor of the complex 
scalar field considered in Exercise 19.15 and that of the massive vector field 
considered in Exercise 19.19.
and instantaneous rest frames, 126-7
in special relativity, 19-20 
radial, 335 

three-acceleration, 123 
uniform, 21 
universe, 187 
see also four-acceleration 
accretion discs 

around compact objects, 240^4, 277 
radiation efficiency, 215-16, 240, 338 
accretion power, of black holes, 216 
action, 525, 526-7 
and classical field theory, 527-9 
equivalent, 533^4 

and general relativity in vacuo, 542-3 
stationary, 529-30 
see also Einstein-Hilbert action 
advanced time parameter, 255 
affine connection, 62^4 
and metric functions, 65-7 
definition, 63 
symmetry, 65 

transformation properties, 64 
affine parameters, 75-6, 117, 120, 221-2, 340 
geodesics, 76-7 
amplitude tensor, 499-500 
Andromeda galaxy, 355 
angular diameter distance, 371, 373^1, 411-13 
angular momentum barrier, 213, 214 
angular speed 
coordinate, 245 
proper, 245 
antiparticles, 275 
aphelion, 230 


Arcminute Cosmology Bolometer Array Receiver 
(ACBAR), 460 
area, manifolds, 38—42 
atlas, 27 

basis tensors, 100 
basis vectors, 56-9, 69 

and coordinate transformations, 60-1 

Cartesian, 113-14 

derivatives, 62^1, 84 

dual, 56-7, 84 

orthonormal, 59 

polar coordinates, 83 

timelike, 152 

see also coordinate basis vectors 
Bianchi identity, 161, 162 
big-bang origin, 399-400, 404, 409, 419 
big-bang theory, 394, 398^400 
big-crunch theory, 395, 398^1-00 
binary system, 508-9 
compact, 277-9 
spin-up, 516-17 
Birkhoff’s theorem, 202 
black hole, 240, 270 
accretion power, 216 
angular momenta, 260 
charged, Reissner-Nordstrom geometry, 

300-2 

definition, 257 

detection, 277-9 

dynamical mass limits, 279 

existence of, 258, 260 

formation of and gravitational collapse, 

259-64, 277 
Hawking effect, 274-7 
in binary systems, 277, 278 
singularities, 258, 270
tidal forces near, 264-5
see also Kerr black hole; Schwarzschild black 
hole; supermassive black hole; white hole; 
wormhole 
blackbody 

energy spectrum, 276 
radiation, 388-9 
temperature, 277 
bounce model, 398 

Boyer-Lindquist coordinates, 318, 319, 320, 322, 
344, 347 

Boyer-Lindquist form, 318 
Brans-Dicke theory, 191-2, 235, 236 
Buchdahl’s theorem, 296 

bug, two-dimensional, confined to two-dimensional 
surface, 33-4, 54 

calculus of variations, 87-8 
Cartesian basis vectors, 113-14 
Cartesian coordinates, 1, 27, 33-5, 320, 346-7, 525 
advantages, 128 

local, 42-4, 46-7, 48, 67-8, 160 
Lorentz transformation, 112-13 
Minkowski spacetime, 111-12 
rotations, 26 

Cartesian inertial frames, 47, 112, 122, 141, 

142, 149 
global, 151 

local, 150-1, 153, 177-8, 179 
centre of mass, worldline of, 170 
centre-of-momentum coordinates, 482, 507 
centrifugal force, repulsive, 217 
Cepheid variable, 370 

Chandrasekhar, Subrahmanyan (1910-95), 259 
Chandrasekhar limit, 259, 277 
charge and electromagnetic force, 135-6 
charge density, 136-7 
proper, 508 

charge density distribution, 508 
charged particle, equations of motion, 144-5 
Christoff el symbol, 106 
of the first kind, 66 
of the second kind, 63 
see also affine connection 
circular motion, 209, 335 
massive particle, 212-13 
photon, 219-20 

see also equatorial circular motion 
circular orbits, 213, 243—1 
stable, 214, 215, 243 
unstable, 214 

circularly polarised mode, 507 
classical field theory, 524 
and action, 527-9 
clocks, ideal, 11 
cold dark matter (CDM), 387 


comoving coordinates, 443, 467n 
and fundamental observers, 356-9 
comoving Hubble distance, 421 
compact-source approximation, 481-3, 507-8 
complex functions, analytic continuation, 254 
components 
metric, 83—4 
mixed, 94, 97 

tensor, 93-1, 100, 102, 103^4 
vector field, 73 

see also contravariant components; covariant 
components 
Compton effect, 124 

Compton scattering, and relativistic collisions, 
123-5 

configuration space, 79, 525 
congruence of timelike worldlines, 356n 
connection coefficients, 85, 202, 244, 317 
of general static isotropic metric, 200-1 
conservation of energy for perfect fluid, 179-81 
conservation of momentum for perfect fluid, 
179-81 

continuity, equation of, 180 
contraction, 66 
Lorentz, 177 
tensor, 99-100 

contravariant components, 56, 57, 59-60, 68 
tensors, 93-4, 100 
coordinate angular speed, 245n 
coordinate basis vectors, 57-9, 113-14, 115-16 
spacelike, 275 
timelike, 275 
coordinate distance, 371 
coordinate patches, 27 
coordinate singularity, 37, 250, 341 
coordinate transformations, 5, 8 
and basis vectors, 60-1 
infinitesimal general, 468, 469-70 
manifolds, 28-30 
tensor, 101-2 
coordinates, 26-52 

arbitrary, 128-31, 142-4, 145, 151, 155-6, 179 

Boyer-Lindquist, 318, 319, 320, 322, 344, 347 

characterisation, 248 

concept of, 27 

Kruskal, 258, 266-71, 273 

Minkowski, 297-8 

momentum, 26 

non-degenerate, 27 

Novikov, 254n 

null, 248, 258 

quasi-Minkowski, 467-8, 485 
spacelike, 248, 249-50, 266, 273 
tensor, 102-3 

timelike, 248, 249-50, 256, 258, 266, 273 
unique values, 27 

see also Cartesian coordinates; comoving 
coordinates; Eddington-Finkelstein 
coordinates; inertial coordinates; 
polar coordinates; Schwarzschild 
coordinates 
corona, solar, 239 

cosmic censorship hypothesis, 301, 324 
cosmic microwave background (CMB) 
characterisation, 388-9 
distribution, 389 
power spectrum, 459-62 
cosmic time, 419 

cosmological constant, 185-8, 376, 386, 407-8, 432 
effective, 430 
size of, 188 

cosmological field equations, 386, 392-3, 407, 
433, 434 
derivation, 376-9 
cosmological fluids 
components, 386-9 
equations of motion, 379-80 
multiple-component, 381 
scalar fields as, 431-2 
cosmological models, 386-421 
analytical, 400-8 
cosmological parameters, 390-2 
cosmological principle, 355-6 
cosmological redshift, 367-8 
covariant components, 57, 59-60, 81 
of tensors, 93-4, 100, 112 
covariant derivative, 68-70, 85-6 
of tensor, 104-7 
critical density, 392 
curl, vector field, 71 
curvature, 33 

and geodesic deviation, 165-7 
and parallel transport, 163-5 
Gaussian, 161, 171 
of manifold, 157-8 

see also spacetime curvature; spatial curvature 
curvature density parameter, 391 
curvature perturbations 

and gauge invariance, 446-9 
evolution, 449-52 
initial conditions, 452-5 
normalisation, 452-5 
power spectrum, 456-7 
curvature scalar, see Ricci scalar 
curvature spectrum, definition, 456 
curvature tensor, 75, 158-9, 182, 250, 267 
properties, 159-61 
in Schwarzschild coordinates, 264 
spherical surfaces, 161, 170 
curve in manifold, 27-8 


closed timelike, 301, 327 
non-null, 75-6 
null, 75-6 
parametric, 28 
tangent vector, 55-6 
curved spacetime, 181, 244 
electromagnetism in, 155-6 
geodesic motion, 188 
observers in, 152-3 
tidal forces in, 167-70 
see also spacetime curvature 
cyclic identity, 160 
cylinder, 34 

parallel transport around, 165 

d’Alembertian operator, 71, 140, 144, 148, 
471, 475 

covariant, 432, 535 
dark matter, 387 
de Sitter model, 398 
properties, 407 

deceleration parameter, 368-71 
delta function, four-dimensional, 189 
density parameters, 390-2, 410 
curvature, 391 
evolution, 415-17 
present-day, 415 
total, 392 
derivatives 
absolute, 72 

of basis vectors, 62-4, 84 
see also covariant derivative; directional 
derivative; intrinsic derivative 
development angle, 402 
differential manifold, 26 
directional derivative, 56 
vectors as, 81-2 

distance-redshift relation, 411-13 
divergence, of vector field, 70 
divergence theorem, 532 
Doppler effect, 16-18, 240 

and relativistic aberration, 120-1 
formula for, 18 
dual basis vectors, 56-7, 84 
dust, 178, 182 

spherically symmetric collapse of, 260-4 
use of term, 176 
worldline, 190 

eclipses 
lunar, 235 
solar, 235 

Eddington, Sir Arthur Stanley (1882-1944), 
235, 542 

Eddington-Finkelstein coordinates, 254-9, 303 
advanced, 258, 261, 262, 270, 303, 346 
definition, 254-9 
limitations, 266 
and Kerr geometry, 344-6 
retarded, 270 
definition, 257-9 
limitations, 266 
effective potential, 335 
general relativistic, 214-15 
Newtonian, 213-14 
Einstein, Albert (1879-1955) 

and cosmological constant, 185-8, 407-8 
and special relativity, 22-3 
elevator experiment, 148-50 
general relativity theory, 150, 233 
‘On the Electrodynamics of Moving Bodies’ 
(1905), 22 

Einstein-Cartan theories, 193 
Einstein-de Sitter (EdS) model, 398, 402, 410, 413 
radiation-dominated, 420 
Einstein equations, 176, 181-3 
and cosmological term, 185-8 
and geodesic motion, 188-90 
exact, 487 

in empty space, 183, 202 
limitations, 317 
non-linearity, 473 
perturbed, 443-5 
solving, 196, 198-202, 248, 270 
for spherically symmetric geometries, 288-305 
weak-field limit, 184-5 
see also gravitational field equations 
Einstein-Hilbert action 

and general relativity in vacuo, 539-42 
Lagrangian density, 544 
stationarity, 545 
variation, 540 

Einstein-Hilbert Lagrangian density, 543, 544 
Einstein-Maxwell coupled equations, 297-8, 300 
Einstein-Maxwell formulation of linearised 
gravity, 490-2 
Einstein-Rosen bridge 
and wormholes, 271-4 
structure, 273 

Einstein tensor, 162, 183, 442-3, 444, 546 
linearised, 487 

Einstein’s static universe, 407-8 
electric fields in inertial frames, 141-2 
electrodynamics, 22, 189-90 
electromagnetic field, energy-momentum tensor 
for, 548-9 

electromagnetic field equations, 138-9, 176 
derivation, 136-7 

in arbitrary coordinates, 142-4, 155-6 
simplification, 140-1 
see also Maxwell’s equations 
electromagnetic field tensor, 138, 139, 140, 156 


antisymmetric, 136 
components, 142, 176 
definition, 136 
electromagnetic forces, 148 
and charge, 135-6 

electromagnetic radiation, generation of, 508 
electromagnetic waves and gravitational waves 
compared, 501 

electromagnetism, 135-46, 508 

and Lorenz gauge conditions, 139-41 
and special relativity, 135 
consistent theory of, 135 
from variational principles, 536-9 
in arbitrary coordinates, 142-4 
in curved spacetime, 155-6 
electron degeneracy pressure, 259 
elevator experiment, 148-50 
ellipticity of planetary orbits, 230, 231 
emitters 

four-velocity, 241 
gravitational redshift, 202-5, 315 
empty-space field equations, 288 
solutions, 198-202, 248 
energy, 118-19 

conservation of, 179-81 
potential, 535 
energy density 

of universe, 390, 433 
of vacuum, 187, 390 

energy equation for particle motion, 213-14 
energy-momentum invariant, 119 
energy-momentum tensor, 176-8, 182, 188, 
192, 484 

and spacetime curvature, 176 

canonical, 549 

dynamical, 546-9 

for electromagnetic field, 548-9 

for gravitational field, 486-90, 511-13 

for matter, 546 

for multiple-component fluids, 381 
for perfect fluid, 178-9, 187, 377-9, 432 
for scalar field, 432, 444-5 
non-zero, 288, 296-7, 475 
of vacuum, 187-8 
symmetry of, 179 
epoch, 418-19 
inflationary, 433, 437 
of recombination, 420 
equation of continuity, 180 
equation of state, 292 
polytropic, 293 

equation-of-state parameter, 380 
equations of motion, 148 
and Newtonian gravity, 209 
Euler’s, 181 

for charged particles, 144-5 
for cosmological fluid, 379-80 
for perfect fluids, 179-81 
for photons, 119 
for scalar field, 433 
geodesics, 188-90 
Newtonian, 154-5, 230, 486 
radial, 304-5 
relativistic, 180 
equatorial circular motion 
massive particle, 335-6 
photon, 341 
equatorial orbits 

massive particle, stability, 337-8 
photon, stability, 342-4 
equatorial planes, geodesics in, 330-2 
equatorial trajectories 
energy equation, 331 
massive particle, 332-3 
photon, 338-9 

equivalence principle, 148-50, 191 
equivalent mass densities, 390, 392 
ergoregion, 324, 325-7 
Euclidean geometry, 4 
Euclidean metric tensor, 505 
Euclidean space, 27 
four-dimensional, 37-8 
pseudo-, 47, 54, 114 

three-dimensional, 26, 33, 36-7, 40-2, 70, 271-2 
Euler angles, 26 

Euler-Lagrange (EL) equations, 78, 88, 209, 349, 
432, 525, 529-30 
alternative forms, 531-2, 538 
alternatives to, 331 

substitution of ‘Lagrangian’ into, 79-80, 199, 
205-6 

Euler’s equation of motion, 181 
event horizons, 257, 269, 274-5, 301, 315-16, 420 
formation, 260 
in Kerr metric, 323 
smooth closed convex, 317 
in special relativity, 21-2 
events, unique specification, 1 
expansion problem, 428 
experimental tests 

and Schwarzschild geometry, 230 
of general relativity, 230-46 
exponential expansion, 439-40 
extrinsic geometry, 33-6 

Fermi-Dirac statistics, 259 
Fermi energy, 259 

Fermi-Walker transportation, 127, 152, 153 
field equations, 524 
dynamical, 536 
homogeneous, 476 


non-linearity of, 196 
perturbed, 443-5 
vacuum, 249 

see also cosmological field equations; 
electromagnetic field equations; 
empty-space field equations; gravitational 
field equations; linearised field 
equations 

field Lagrangian, 528-9 
field theories 

Minkowski spacetime, 487 
of real scalar fields, 534-6 
see also classical field theory 
field-strength tensor, 537, 538 
fixed spatial coordinates, 223, 315 
flatness, conformal, 267, 282-3 
flatness problem, 418, 428 
solving, 429-30, 436, 442 
fluid 

four-velocity of, 177-8 
in instantaneous rest frame, 176-7, 178-9 
Lorentz contraction of, 177 
multiple-component, 381 
see also cosmological fluid; perfect fluid 
fluorescence, 240 
force 

gravitational, 147, 148 
pure, 122 

repulsive centrifugal, 217 
see also electromagnetic force; four-force; 
three-force; tidal forces 
four-acceleration, 123, 125-7, 152, 153 
orthogonal, 125 

four-current density, 136-7, 156, 176 
components, 137 

four-dimensional rotations, Lorentz transformations 
as, 5-6 

four-force, 122, 135-6, 156 
pure, 123 
four-gradient, 138 

four-momentum, 123, 126, 144, 207, 274-5, 331 
and Compton scattering, 123-5 
conservation along geodesic, 312-13 
of massive particle, 118-19 
of photon, 119-20, 222-3, 224, 242, 244 
four-potential, 297 
four-tensors, 136, 152, 250 
four-vector potential, 142, 143 
four-vectors, 120, 136, 138-9, 349 
and gyroscopes, 244-6 
and lightcones, 115-16 
and Lorentz transformations, 116 
as geometrical entities in spacetime, 115 
four-velocity, 168-9, 190, 276, 325-6, 349 
and gyroscopes, 244-6 
definition, 116-18 
normalised, 126, 152 
of charged particle, 135-6, 144, 156 
of emitter, 241 
of fluid, 177-8 
of free particle, 504 
of massive particle, 207, 304 
of perfect fluid, 290 
spatial components of, 223 
four-wavevector, 499-500, 501 
and Doppler effect, 120-1 
concept of, 120 
Fourier space, 458 
perturbation equations in, 445-6 

Fowler, Ralph Howard (1889-1944), 259 
frames of reference, 1 
free particle, 123 

gravitational-wave effects, 504-7 
freely falling frame (FFF), 152-3 
frequency shift, 240 

see also Doppler effect; redshift 
Friedmann equations, 379 
Friedmann expansion, 440 
Friedmann-Lemaitre equations, 379 
Friedmann models, 400-3, 419 
dust-only, 401 

radiation-only, 403, 430, 436 
early-time, 439 
spatially flat, 402, 403 

Friedmann-Robertson-Walker (FRW) geometry, 
355-81, 467n 
spatial curvature 
negative, 364-5 
positive, 363-4 
zero, 364 

number densities, 374-6 
proper volume, 375 
volume element, 375 

Friedmann-Robertson-Walker metric, 362, 386, 442 

geodesics in, 365-7 
geometric properties, 362-5 
Friedmann-Robertson-Walker universes, 
properties, 393-4 

fundamental observers and comoving coordinates, 
357-8 

future-pointing vectors, 116 

G2000 + 25, 279 
Galactic centre, 282 
galaxies, 186-7, 355, 420 

and fundamental observers, 358 
Andromeda, 355 
comoving coordinates, 357 
distribution of, 462 


Milky Way, 186, 280, 355 
proper time, 357-8 
spectra, 243 
worldlines, 356 
Galilean transformations, 3, 4 
Galilei, Galileo (1564-1642), 148 
gauge 

choice of, 442-3 
longitudinal, 443 

see also Lorenz gauge conditions; 
transverse-traceless (TT) gauge 
gauge freedom, 140 
gauge invariance, 536, 548, 549 
and curvature perturbation, 446-9 
gauge transformation, 140, 472 
Gauss’ theorem, 482-3 
Gaussian curvature, 161, 171 
general relativity 
and matter, 545-6 
experimental tests of, 230-46 
in vacuo 

and Einstein-Hilbert action, 539-42 
equivalent action, 542-3 
Palatini approach, 543-5 
linearised, 467-92 
predictions, 235 
sign conventions, 193 
theory of, 150, 183, 233 
variational approach, 524-49 
see also special relativity 
geodesic convergence, 167 
geodesic coordinates, local, 68-9 
geodesic deviation 
and curvature, 165-7 
equation of, 165, 167, 168 
geodesic equations, 77, 78-9, 145, 154, 189-90, 504 
alternative forms, 81 
integration, 206 

geodesic motion and Einstein equations, 188-90 
geodesic precession effect, 246 
geodesics, 76-7 
congruence, 356 
in equatorial plane, 330-2 
in Friedmann-Robertson-Walker metric, 365-7 
in Minkowski spacetime, 128 
in Schwarzschild geometry, 205-7 
Lagrangian procedures, 78-80, 199, 205 
non-null, 80, 123, 206, 332 
stationary property, 77-8 
null, 80, 123, 203, 206, 256 
principal, 339 
polar coordinates, 86-7 
timelike, 168, 244 
geometry 
Euclidean, 4 
extrinsic, 33-6 
intrinsic, 33-6 
Kerr, 230, 310-50 
Newtonian, 3 
of manifolds, 31 
Riemannian, 32-3 
spacetime, 3-5 

see also Friedmann-Robertson-Walker 
geometry; Minkowski geometry; 
non-Euclidean geometry; 
Reissner-Nordstrom (RN) geometry; 
Schwarzschild geometry 
gradient 

four-gradient, 138 
in scalar field, 70 

grand unified theories (GUTs), 431 
phase transitions in, 436, 438-9 
gravitational binding energy, 337-8 
gravitational collapse 

and black-hole formation, 259-60, 261-4, 277 
and redshift, 263-4 
concept of, 259 
free-fall, 263 

gravitational deflection formula, 234 
gravitational effects included in field equations, 156 
gravitational field equations, 176-93, 376 
in empty space, 183 
non-linearity of, 196, 467 
see also Einstein equations; linearised field 
equations 

gravitational field tensor, 514 
gravitational fields 

energy-momentum tensor, 486-90, 511-13 
non-vanishing, 184 
weak, 153-5, 467-70 
gravitational focussing, 413 
gravitational forces, 147, 148 
gravitational Lorentz force law, 491 
gravitational mass, 147, 149 
gravitational matter density, 147-8 
gravitational Maxwell equations, 491 
gravitational perturbations, 502 
gravitational potential, 147, 155, 168, 185, 201 
Newtonian, 486 

gravitational radiation, 508, 516-17 
gravitational redshift, 486 

for fixed emitter or receiver, 202-5, 315 
general approach, 221-4 
gravitational waves, 498-520 

and electromagnetic waves compared, 501 

and linear strain, 518-19 

detection, 517-20 

effect on free particle, 504-7 

emission, energy loss, 513-16 

energy flow, 511-13 


existence, 498 
generation, 507-11 
polarisation, 510-11 
see also plane gravitational waves 
gravitational-wave luminosity, 514, 515 
gravitoelectric fields, 491 
gravitomagnetic fields, 491 
gravity, 135 

as spacetime curvature, 150-1 
strong-field regime, 240 
theories of, 524 
Brans-Dicke, 191-2, 235, 236 
relativistic, 191-3 
scalar, 190 
scalar-tensor, 192 
self-consistent, 542 

see also linearised gravity; Newtonian gravity 
gravity-electromagnetism coupling, 191 
gravity-matter coupling strength, 192 
Gravity Probe B (GP-B), 246 
Green’s functions, 475-8 
Guth model, 438 

gyroscopes, geodesic precession, 244, 246 
slow-rotation limit, 347-50 

Hamilton’s principle, 524-7 
Harrison-Zel’dovich spectrum, 458, 459 
Hawking, Stephen (1942-), 274 
Hawking effect, 274-7 
definition, 275 
Hawking temperature, 276-7 
Heaviside functions, 478 
Heisenberg’s uncertainty principle, 274 
Higgs field, 431, 438, 440 
horizon problem, 419-20, 428 
solving, 437, 442 
hot dark matter (HDM), 387 
Hubble, Edwin (1889-1953), 186-7, 369 
Hubble distance, 420-1 
comoving, 421, 429, 450, 451 
Hubble parameter, 368-71, 390, 392, 407, 435, 444 
and redshift, 393 
periods when constant, 434-5 
Hubble Space Telescope, 280 
Hubble time, 397, 398, 400 
and age of universe, 408-10 
Hubble’s law, 370 
Hulse, Russell Alan (1950-), 517 
hydrogen, nuclear burning of, 216, 240 
hyperbolae, 21, 268 
invariant, 11-12 

hypersurfaces, 28, 248, 271-2, 477, 483 
non-intersecting spacelike, 356 
hypervolumes, four-dimensional, 476 
impact parameter, 220-1 
indices 

dummy, 31, 94, 535, 544-6 
free, 30-1 
lowering, 60 
raising, 60 
tensor, 97 
vector, 57, 59-60 
inertial coordinates 
Cartesian, 140, 144 
local, 151-2 
inertial frames, 117 

and principle of relativity, 1-2 
concept of, 1 

dragging of, 312-14, 346, 347, 350 
electric fields in, 141-2 
four-current density in, 136-7 
in standard configuration, 2, 113 
magnetic fields in, 141-2 
transformations between, 6 
see also Cartesian inertial frames; instantaneous 
rest frames 

inertial mass, 148, 149 
infinite redshift surfaces, 315, 324, 419 
infinitesimal general coordinate transformations, 
468, 469-70 
inflation 

amount of, 435-7 
chaotic, 437-8, 440-1 
definition, 428-9 
ending, 435, 440 
new, 437, 438-40 
periods of, 429, 430 
perturbations from, 442 
predictions, 456-7 
starting, 437-8 
stochastic, 438, 441-2 
inflationary cosmology, 420, 428-62 
models, 437 

theory vs. observation, 459-62 
inflationary epoch, 433, 437 
inflaton field, 431-2 

instantaneous rest frame (IRF), 15, 20, 168-9 
and acceleration, 126-7 
definition, 125 
fluid in, 176-7, 178-9 
integration constant as new coordinate, 255 
interferometers, 519 
interval 

and lightcone, 6-8 
infinitesimal, 13-14 
lightlike, 7, 14 
quadratic, 32 
spacelike, 7-8, 14 
timelike, 7-8, 14 


intrinsic derivative, 71-3 
tensor, 107-8 
intrinsic geometry, 33-6 
invariant hyperbolae, 11-12 
inverse transformations, 29-30 
iron, spectral lines of, 240, 243 
isotropic metric 

general static, 196-8 

connection coefficients, 199-200 
stationary, 198 
isotropy of universe, 355 

Jacobian, 29-30, 48 

Kepler’s laws, 277-8 
Kerr, Roy P. (1934-), 321 
Kerr black holes 
binding energy, 338 
extreme, 323 

rotational energy, 325, 327-9 
structure, 322-7 
Kerr geometry, 230, 310-50 
Kerr metric, 243, 246, 317-19, 322 
event horizon, 323 
extension, 327 
limits of, 319-20 
Kerr-Schild form, 321-2 
Kerr solution, 345-6, 347 
frame-dragging effect, 346, 347 
kinetic energy, 535 
Klein-Gordon equation, 536 
covariant, 432 
Kronecker delta, 30 
Kruskal, Martin David (1925-), 266 
Kruskal coordinates, 258, 266-71, 273 
Kruskal extension, 301 
Kruskal spacetime diagrams, 269, 270, 273 

Lagrangian, 209, 525, 535, 536-7, 539 
field, 528-9 
gravitational, 542 
substitution of, 79-80, 199, 205-6 
Lagrangian density, 526, 528, 529-30, 531-2 
in Einstein-Hilbert action, 542-3 
modified, 533 

variational derivative of, 530 
Lagrangian formalism, 524 
Lagrangian procedures, 349 
for geodesics, 78-80, 199, 205 
Laplacian, 148 

four-dimensional, 140 
scalar field, 70, 71 
spatial, 444 
symbols for, 71 

laser Michelson interferometers, 519-20 
Leibnitz’ rule, 526, 542 
Leibnitz’ theorem, 63, 530, 532, 544, 547 
Lemaitre models, 393-4 
matter-only, 404-6 
spatially flat, 406-7, 410, 419 
properties, 404 
see also de Sitter model 
length 

coordinate, 39 
in manifolds, 38-42 
proper, 10, 39 
length contraction, 10-11 
Lense-Thirring effect, 350 
Lie derivative, 447n 

light, bending of, 233-6, 486, see also speed 
of light 
lightcones 

and four-vectors, 115-16 

and intervals, 6-8 

and Schwarzschild solution, 251-2 

at Schwarzschild radius, 257 

future-pointing, 477 

past, 479 

special-relativistic, 218 

line element of Minkowski spacetime, 12-14, see 
also Schwarzschild line element 
linear strain and gravitational waves, 518-19 
linear transformations, 2, 46 
linearised field equations, 467, 470-1, 487, 490-1 

compact-source approximation, 481-3 
empty-space, 502 
far-field solution, 481-3 
general properties, 473-4 
general solution, 475-80 

multipole expansion for, 480-1 
in vacuo solution, 474-5 
static source, 485-6 
stationary source, 483-5 
linearised general relativity, 467-97 
linearised gravity, 472 

Einstein-Maxwell formulation of, 490-2 
Local Group, 355 
local theories, 50 
longitudinal gauge, 443 
look-back time, 408-10 
Lorentz contraction of fluid element, 177 
Lorentz force law, 491 
gravitational, 492 
Lorentz invariant, 137 

Lorentz symmetry, loss of in general relativity, 487 
Lorentz transformation matrices, 125 
Lorentz transformations, 4, 8, 13, 22, 127, 151 
and four-vectors, 116 
and length contraction, 10 
as four-dimensional rotations, 5-6 
Cartesian coordinates, 112-13 


differentials, 18 
global, 468, 488 
homogeneous, 6, 113 
inertial frames, 313 
inhomogeneous, 6, 113 
Lorentz-boost transformations, 4, 5, 8, 9, 11 
Lorenz gauge conditions, 501, 510 
and electromagnetism, 139-41 
definition, 140 

in arbitrary coordinates, 143-4 
linearised gravity in, 472, 473-4 
satisfying, 478-9, 498-9, 502, 511-12 
luminosity 
absolute, 372 

and gravitational collapse, 263 
gravitational-wave, 514, 515 
luminosity distance, 372-4, 411-12 
Lynden-Bell, Donald (1935-), 280 

Mach’s principle, 149 
magnetic fields in inertial frames, 141-2 
magnetohydrodynamic instabilities, 215 
manifold, 26-52 
arbitrary, 528 
area, 38-42 
concept of, 26-7 

coordinate transformations in, 28-30 

coordinates for, 26 

curvature of, 157-8 

differential, 26 

dimensions, 26 

flat, 157, 159 

geometry of, 31 

length of, 38-42 

local geometry of, 31 

one-dimensional, 527 

pseudo-Euclidean, 111 

scalar fields, 53 

Schwarzschild, 272 

signatures of, 47 

tangent spaces to, 44-5, 47, 54, 59 
tensor calculus on, 92-110 
tensor fields on, 92-3 
topology, 49-50 
torsionless, 65, 76 
two-dimensional, 54-5 
vector calculus, 53-91 
vector fields on, 54-5 
volume, 38-42 

see also pseudo-Riemannian manifolds; 
Riemannian manifolds; submanifolds 
Mars, Viking lander, 239 
masers, 281 
mass function, 278-9 
massive particle 

circular motion, 212-13 
equatorial, 335-6 
equatorial trajectories, 332-3 
four-momentum, 118-19 
orbits, stability of, 213-17 
radial motion, 209-11 
equatorial initially, 333-5 
trajectories, 304-5 
trajectories, 207-9 
matter 

and general relativity, 545-6 
baryonic, 387, 391 
dark, 387 

energy-momentum tensor, 546 
non-baryonic, 387 

matter-density, 176-7, 387-8, 389, 393 
gravitational, 147-8 
matter-density distribution, quadrupole 
moments, 508 
matter density perturbations 
growing mode, 459 
power spectrum, 458-9 
matter power spectrum, 458 
maximal analytic extension, 301 
maximally symmetric 3-space, 359-61 
Maxwell’s equations, 22-3, 139, 141, 176, 189-90, 
297-8 

gravitational, 491 
homogeneous, 538 
inhomogeneous, 538 
predictions, 498 

see also electromagnetic field equations 
MCG-6-30-15, spectra, 243 
mechanics 

Newtonian, 524-7 
quantum, 259, 274 
relativistic, 122-3 
Mercury 

perihelion shift, 233, 235 
precession, 233 
retardation, 191 
metric components, 83-4 
metric connection, use of term, 66 
metric function, 196 

and affine connection, 65-7 
metric tensor, 32, 93, 96, 112 
Michelson-Morley experiment, 22, 23 
Milky Way Galaxy, 187, 280-1, 355 
Minkowski, Hermann (1864-1909), 4 
Minkowski coordinates, 297-8 
Minkowski geometry, 5, 26, 317, 319-20 
spacetime, 153, 156 
use of term, 4 

Minkowski regions, 269-70 
Minkowski spacetime, 13, 123, 181, 251, 457, 468 
as background, 469, 471, 476n, 484-5, 487, 498 
coordinate transformations, 5 


field theories, 487 
fixed, 473 

four-dimensional, 47, 364 
in arbitrary coordinates, 128-31, 142-4 
in Cartesian coordinates, 111-12 
line element, 12-14 
pseudo-Euclidean, 114, 151 
symmetries, 486-7 
tensorial equations, 135 
weak distortions, 487 
Minkowski 2-space, 266-7 
momentum coordinates, 26 
Moon, eclipses, 235 

motion, equations of, see equations of motion 
M-theory, 271 

N Oph 77, 279 
naked singularities, 301, 324 
National Aeronautics and Space Administration 
(NASA) (US), missions, 246 
neutrinos, 259, 388 
neutron, discovery of, 259 
neutron star, 259-60, 288 
gravitational forces, 260 
in binary system, 277, 278 
Newtonian dynamics, 213, 280 
Newtonian gauge see longitudinal gauge 
Newtonian geometry of space and time, 3 
Newtonian gravity, 147-8, 153, 183, 185, 458 
and equations of motion, 208-9 
and planetary motion, 230 
and tidal forces, 167-8, 264 
field equation, 181-2, 186 
relativistic generalisation, 191 
Newtonian limit, 153-5, 180, 182, 185, 393-4, 509 
and binary systems, 516 
and static sources, 485-6 
Newtonian mechanics, Hamilton’s principle in, 
524-7 

Newtonian potential, 443 

Newtonian theory, 147, 154-5, 180, 181, 183, 230 
and special relativity compared, 2 
of stellar structure, 288 
Newton’s laws of motion, 1 
NGC 4258, 281 
Nobel Prize, 517 
Noether’s theorem, 549 
non-Euclidean geometry, 4 
examples, 36-8 
non-Euclidean space 
infinite, 42 

three-dimensional, 38, 41-2 
non-inertial frames, 129 
normalised scale parameter, 389 
Novikov coordinates, 254n 
null curve, 75-6 
null-cone, 116 
number densities 

in Friedmann-Robertson-Walker geometry, 
374-6 

proper, 375-6 
observers 

accelerating, 125-8 
in curved spacetime, 152-3 
fundamental, 356-7 

Oppenheimer-Volkoff equation, 293, 294 
Oppenheimer-Volkoff limit, 260 
orbit 

Newtonian, 208 
non-circular, 216-17, 230 
of massive particle, 212-17 
shape of, 208 
spiral, 215, 217 

see also circular orbits; equatorial orbits; photon 
orbits; planetary orbits 
orthogonal connecting vectors, 170 
orthogonal coordinates, 39 
orthonormal basis vectors, 59 

Palatini approach for general relativity in vacuo, 
543-5 

Palatini equation, 541, 544 
parallel transport, 222 
and curvature, 163-5 
and gyroscopes, 244 
of tensor, 108 
of vector, 73-5 
on spherical surface, 165 
path dependence, 74-5 
particle 

charged, 144-5 
four-momentum, 312-13 
infalling, 210-11, 219, 252-9 
non-interacting, 176 
tunnelling, 275 

see also free particle; massive particle 
particle-antiparticle pairs, 274 
particle horizon, 418-20 
particle worldlines, 14-16, 116-17, 154, 156 
radial, in Schwarzschild coordinates, 252-4 
past-pointing vectors, 116 
Pauli exclusion principle, 259 
Penrose, Sir Roger (1931-), 260, 301, 324, 325 

Penrose process, 327-9, 344 
perfect fluid, 289, 386-9 
and weak-field limit, 184-5 
conservation of energy-momentum, 179-81 
definition, 178-9 

energy-momentum tensors, 178-9, 187, 376-9, 432 


equations of motion, 180-1 
four-velocity, 290 
perihelion, 230 
shift, 233, 235, 486 

perturbation equations, in Fourier space, 445-6 
perturbations 

from inflation, 442 
gravitational, 502 
Newtonian potential, 443 
scalar-field, evolution, 442-6 
see also curvature perturbations; matter density 
perturbations 
phase transitions, 430-1 
photon, 388 

circular motion, 219 
equatorial, 341 
equation of motion, 119 
four-momentum, 119-20, 222, 223, 241, 243 
four-wave vector, 120-1 
radial motion, 218-19, 302-4 
radially outgoing, 257-8 
redshift, 204-5, 240, 243, 408 
trajectories, 217, 233-4 
equatorial, 338-9 
photon orbits 
circular, 233 

energy equation, 219-20, 236-7 
equatorial, stability of, 342-4 
general, 220 
stability of, 220-1 
photon path deflection, 237-9 
photon propagation, 342-3, 344 
photon worldlines, 14, 218, 242, 256 

radial, in Schwarzschild coordinates, 251-2 
Planck era, 430-1, 436 
Planck scales, 270-1 
plane gravitational waves 

and polarisation states, 498-500 
effects on free particles, 504-7 
propagation, 505 

planetary motion and Newtonian gravity, 231 
planetary orbits 
ellipticity, 230, 231 
precession, 230-3 
Poincare transformations, 6, 7, 113 
Poisson’s equation, 147, 183, 185, 458 
polar coordinates 
cylindrical, 272 
in a plane, 82-7 

polarisation states and plane gravitational waves, 
498-500 

polarisation tensors, linear, 500, 511 
polytropic index, 293 
position coordinates, 26 
potential energy, 535 
potential functions, 443 
power spectrum 

cosmic microwave background, 459-62 
curvature perturbations, 456-7 
definition, 456 

matter density perturbations, 458-9 
scale invariant, 456 
precession 

geodesic, 244-6 
gyroscopes, 244-6 

slow-rotation limit, 347-50 
planetary orbits, 230-3 
primordial spectral index, 457-2, 458 
principal photon geodesics, equatorial, 339-41 
principle of relativity and inertial frames, 1-2 
principal stresses, 170 
projectiles and elevator experiment, 148-9 
proper angular speed, 245 
proper charge density, 508 
proper density, 509 
proper distance, 371 
proper length, 10, 39 
proper mass density, 508 
proper motion of stars, 280-1 
proper number densities, 375-6 
proper time, 14-16, 204, 206-7, 219, 252-3 
finite, 211, 254 
infinite, 211, 254 
proper volume, 10 
protons, 259 

Schwarzschild radius, 249 
pseudo-Euclidean geometry, see Minkowski 
geometry 

pseudo-Euclidean manifolds, four-dimensional, 111 
pseudo-Euclidean space, 47, 54, 114 
pseudo-Riemannian manifold, 39, 45-7, 53, 54, 59, 157 

curved spacetime, 150-1 

local Cartesian coordinates, 46-7, 67-8 

non-null curve, 75 

null curve, 75 

use of term, 32 

vectors, 62, 74 

pseudotensors, use of term, 468 
PSR B1913+16 (binary pulsar), 517-18 
pulsars, binary, 516-17 
pulsation, radial, 202 

quadratic intervals, 32 
quadrupole formula, 483, 507-8 
quadrupole-moment tensor, 483, 508, 509-10, 514, 516 

reduced, 514, 516 

quadrupole moments, transverse-traceless, 514-15 
quantum chromodynamics and inflation, 431 
quantum gravity, theory, 188 
quantum mechanics and white dwarfs, 259, 274 


quark-hadron phase transition, 431 
quasars 

discovery, 279-80 

radio-wave deflection measurements, 236 
quasi-Minkowski coordinates, 467-8, 485 
quasi-stellar objects (QSOs), 279n 
quotient theorem, 103-4 

radar echoes, 236-9 
radial coordinates, 209 
radial distance, 209 
radial motion 

equatorial initially, massive particle, 333-5 
massive particle, 209-11 
photon, 218-19 
radiation density, 388-9, 393 
radiation efficiency in accretion discs, 215-16, 240, 338 

radio quasars, 235 
radio sources, 235 
rapidities, addition of, 19 
rapidity parameter, 5, 18 

receiver and emitter fixed, gravitational redshift, 
202-5, 315 
recombination, 420 
red giants, 288 
redshift, 187, 191, 395, 411 

and gravitational collapse, 262-3 
and Hubble parameter, 393 
cosmological, 355-6 
infinite, 315, 324, 419 
photon, 204-5, 240, 242, 408 
quasar, 279-80 
see also gravitational redshift 
reheating, 451 
use of term, 435 

Reissner-Nordstrom (RN) black hole, extreme, 301 
Reissner-Nordstrom geometry 
charged black hole, 300-2 
radial massive particle trajectories, 304-5 
radial photon trajectories, 302-4 
spacetime diagram, 303 
relative three-vector, 117 
relativistic aberration 

and Doppler effect, 120-1 
formula, 121 

relativistic collisions and Compton scattering, 
123-5 

relativistic gravitational equations 

static spherically symmetric charged body, 288, 
296-300 

stellar interior, 288-92 
stellar structure, 292-4 
relativistic mechanics, 122-3 
relativistic theories of gravity, 191-3 
r-equation replacement, 206, 217 
resonant detectors, 518 
retarded time parameter, 258 
Ricci scalar, 250, 539-40 
definition, 161-2 
linearised, 470-1 

Ricci tensor, 182, 199, 202, 359-60, 540 
components, 200, 289-90, 298-9, 317, 378 
definition, 161-2 
linearised, 470-1 
terms, 488-9 

Riemann tensors, see curvature tensors 
Riemannian geometry, 32-3 
Riemannian manifolds, 26, 39, 61 
definition, 32 

local Cartesian coordinates, 42-4 
two-dimensional, 33, 44-5, 53, 311 
conformal flatness, 267, 282-3 
vectors, 74 

see also pseudo-Riemannian manifolds 
rotating bodies 

characterisation, 310 
slow, 347-50 

spacetime geometry, 310-50 

scalar density, 529 
scalar field, 430-1, 527-9 
as cosmological fluid, 431-2 
energy-momentum tensor, 432, 444-5 
equations of motion, 433 
field theories, 534-6 
gradient, 70 
Higgs-like, 438 
Laplacian, 70, 71 
on manifolds, 53 
quantum irregularities, 442 
reheating, 435, 451 
scalar multiplication, tensors, 98 
scalar parameters, 431 
scalar product 
positive definite, 61 
vectors, 58 

scalar-tensor theory of gravity, 191-2 
scalar theory of gravity, 191 
scalar, covariant derivatives, 69-70, see also Ricci 
scalar 

scale factor, 376-80 
evolution of, 397-400 
scale fluctuations, super-horizon, 451 
scale invariance, 456 
scale-invariant spectrum, 459 
scaling factors, conformal, 266-7 
Schmidt, Maarten (1929-), 279-80 
Schwarzschild, Karl (1873-1916), 196 
Schwarzschild black holes, 202, 240, 248-83, 288 
formation of, 260-3, 296 
Schwarzschild constant-density solution, 296 


for stellar interior, 294-5 
Schwarzschild coordinates, 248, 250, 261, 262, 
264, 268 

radial particle worldlines in, 252-4 
radial photon worldlines in, 251-2 
timelike, 266 

Schwarzschild geometry, 196-224, 233-4, 248, 
288, 301 

and experimental tests, 230 
geodesics in, 205-7 
in Kruskal coordinates, 266-71 
spacetime diagram, 256-7 
static, 272 
tidal forces in, 264 

Schwarzschild line element, 204, 205, 255, 258 
derivation, 198-201 
Schwarzschild manifold, 272 
Schwarzschild metric, 211, 240, 242-3, 266, 
277, 443 

connection coefficients, 244 
singularities in, 249-50 
spherical symmetry, 206, 289, 292, 296 
validity, 201-2 

Schwarzschild radius, 202, 249, 251, 252 
lightcone structure at, 257 
Schwarzschild solution, 260, 486 
lightcone structure of, 251-2 
maximal extension, 270 
Schwarzschild spacetime, 202-3 
Shapiro effect, 486 
short X-ray transients, 279 
sign conventions, in general relativity, 193 
signatures, of manifolds, 47 
simultaneity, concept of, 9 
singularities, 38, 269-71 
black-hole, 259, 271 
coordinate, 37, 250, 341 
intrinsic, 250, 288 
naked, 301, 324 
real, 252 
ring, 327 

in Schwarzschild metric, 249-50 
spacelike, 252, 345-6 
timelike, 345-6 
white-hole, 258, 270 
singularity theorems, 260 
Sirius, 259 
Sirius B, 259 

slow-roll approximation, 434-5, 440 
source term in field equations, 136, 176 
space 

empty, 183 

pseudo-Euclidean, 47, 54, 114 
with Newtonian geometry, 3 
see also Euclidean space; Fourier space; 
non-Euclidean space 
spacetime, 26 
empty, 184 
four-dimensional, 158 
geometry of, 240 
Minkowski geometry, 153, 156 
of special relativity, 1-25 
paths in, 13-14 
rotations in, 127 
Schwarzschild, 202-3 
static, 196, 273 
stationary, 196, 275, 315 
symmetries, 196 

see also curved spacetime; Minkowski spacetime 
spacetime curvature, 181, 190, 250 
and energy-momentum tensors, 176 
gravity as, 150-1 
see also curved spacetime 
spacetime diagrams, 7, 8-9, 258, 261, 267 
Kerr solution, 346 
Reissner-Nordstrom geometry, 303 
Schwarzschild geometry, 256-7 
see also Kruskal spacetime diagrams 
spacetime geometry 
dynamics, 376 
of special relativity, 3-5 
rotating bodies, 310-50 
spacetime indices, 528 
spacetime torsion, 193 
spatial amplitude tensor, 503 
spatial curvature 
evolution, 417-18 
negative, 364-5, 391 
positive, 363-4, 391 
zero, 364 

spatial momentum, 275 
spatial projection tensors, 503 
spatial velocity fields, 484 
special relativity, 111-13 
and electromagnetism, 135 
and elevator experiment, 149-50 
and Newtonian theory compared, 2 
acceleration in, 19-20 
Einstein’s route to, 22-3 
event horizon in, 21-2 
spacetime geometry of, 3-5 
spacetime of, 1-25 
velocity addition, 18-19 
spectrum, 240, 243 
curvature, 456 

Harrison-Zel’dovich, 458, 459 
matter power, 458 
see also power spectrum 
speed of light, 14, 23 
constant, 3-4 
spherical mass, 486 

and gravitational redshift, 202-5 


light deflection, 235, 236 
Schwarzschild metric, 201 
spherical surfaces, 36-7, 171-2 
curvature tensor, 161, 171 
four-dimensional, 37-8 
geodesic convergence, 167 
parallel transport, 165 
three-dimensional, 35, 40-2 
two-dimensional, 35 
vector field, 54 

spherical symmetry, 202, 206, 288-305, 310 
spherically symmetric collapse, 260-4 
spin, quantum mechanical, 192-3 
spontaneous symmetry breaking, 431 
stars 

age of, 410 

gravitational collapse, 259-60, 261-3 
maximum mass, 260 
proper motion of, 280-1 
radial pulsation, 202 
velocity dispersion, 280 

see also binary system; neutron star; white dwarf 
static metric, 196-7 
static source 

and Newtonian limit, 485-6 
non-relativistic, 484 

static spherically symmetric charged body, 

relativistic gravitational equations, 288, 
296-300 

stationary axisymmetric metric, general, 310-12 
stationary limit surface, 314-15, 324 
stationary source, 483-5 
Stefan-Boltzmann constant, 277, 388 
stellar interior 

relativistic gravitational equations, 288-92 
Schwarzschild constant-density solution, 294-5 
stellar structure 
Newtonian theory of, 288 
relativistic gravitational equations, 292-4 
stress-energy tensor, see energy-momentum tensor 
submanifold, 28 

integration over, 47-9 
subtraction, tensor, 98 
summation convention, 30-1 
Sun 

corona, 239 
eclipses, 235 

gravitational collapse, 259 
gravitational redshift, 486 
and light bending, 233-6 
photon path deflection, 237-9 
Schwarzschild radius, 249 
super-horizon scale fluctuations, 451 
supermassive black holes, 265, 279-82 
existence, 280 
potential, 282 
supernovae, 188 

superstring theory (M-theory), 271 
surfaces in manifolds, 27-8 
infinite redshift surfaces, 315, 325, 419 
parametric, 28 

stationary limit, 314-15, 324 
three-surfaces, 315-16 
two-surfaces, 272 

see also hypersurfaces; spherical surfaces 

tangent space to manifold, 44-5, 47, 54, 59 
tangent vectors, 57, 76, 80, 123, 248 
covariant components, 81 
as directional derivative, 81-2 
length, 75 
to curve, 55-6 

Taylor, Joseph Hooton, Jr (1941-), 517 
tensor calculus on manifolds, 92-110 
tensor equations, 102-3 
tensor fields, 487, 528 
on manifolds, 92-3 
symmetric, 498 
tensor product, 98-9 
tensorial equation, 135 
tensorial operations 
definition, 98 
elementary, 98-100 
tensors 

addition, 98 

amplitude, 499-500 

and coordinate transformations, 101-2 

arbitrary, 104 

as geometrical objects, 100-1 
basis, 100 

components, 93-4, 100, 102, 103-4, 112 

concept of, 92 

contraction, 99-100 

coordinates, 102-3 

covariant derivatives, 104-7 

definition, 93 

field-strength, 537, 538 

four-tensors, 136, 152, 250 

gravitational field, 514 

indices, 97 

inner product, 99-100 
intrinsic derivatives, 107-8 
linear polarisation, 500, 511 
mapping, 97-8 
metric, 96 
outer product, 98-9 
parallel transport, 108 
rank of, 93 
rank-1, 93 

rank-2, 94, 95, 100, 105, 177, 182 
definition, 98-9 
rank-3, 99 


scalar multiplication, 98 
spatial, 503 
subtraction, 98 
symmetries, 94-6 
tidal stress, 168 
torsion, 65 
zero-rank, 93 

see also curvature tensor; electromagnetic field 
tensor; energy-momentum tensor; metric 
tensor; quadrupole-moment tensor 
tetrads, 125, 126, 127, 152 
threading for spacetime, 356-7 
three-acceleration, 123 
three-force, 122 

electromagnetic, 147 
three-momentum, 119 
three-space, 272 

see also maximally symmetric 3-space 
three-space vectors, 127 
three-spheres, 37-8 
three-surfaces, null, 315-16 
three-vector potential, 297-8 
three-vectors, 130, 135, 138, 141 
relative, 117 
unit, 120 

three-velocity, 118 
spatial, 290 

three-wavevectors, 499 
tidal forces, 149 

and Newtonian gravity, 167-8, 264 
gravitational, 167 
in binary systems, 277 
in curved spacetime, 167-70 
in Schwarzschild geometry, 264 
near black holes, 264-5 
tidal stress tensor, 168 
time 

cosmic, 419 
look-back, 408-10 
Newtonian geometry, 3 
retarded, 478, 479 
see also Hubble time; proper time; 
spacetime 

time dilation, 10, 11 

in weak gravitational field, 155 
timelike curves, closed, 301, 327 
topology of manifolds, 49-50 
torsion tensor, 65 
torsion theories, 192-3 
tortoise coordinate, 266 
total density parameter, 392 
trajectories 

of infalling particle, 210, 218, 279, 252-9 

of massive particle, 207-9 
of photon, 217, 233-4 
radial 

of massive particles, 304-5 
of photons, 302-4 
see also equatorial trajectories 
transformation matrices, 29 
transformations 
Galilean, 3, 4 
gauge, 140, 472 
inverse, 29-30 
Jacobian of, 29-30 
linear, 2, 46 
Poincare, 6, 7, 113 

see also coordinate transformations; Lorentz 
transformation 

transverse-traceless (TT) gauge, 500, 504, 505 
transformation into, 502-3, 510-11 
tunnelling of particles, 275 
turbulent viscosity, 215 
two-spheres, 34-5, 41-2, 250 
two-surfaces, 272 

ultra-stiff equations, 294 
uncertainty principle, 274, 276 
unified electroweak theory, 431 
unit vectors, 62 
timelike, 126 
universe 

and Friedmann-Robertson-Walker geometry, 
355-81 

acceleration, 188 
age of, 398, 400, 408-10 
collapse, 398 
dynamics, 401 
energy density of, 390, 433 
expansion, 367, 420 
acceleration phase, 429-30 
deceleration phase, 428-9 
vs. contraction, 186, 188 
general dynamical behaviour, 393-7 
geometry, 401 
homogeneity, 355-6 
isotropy, 355-6 
radiation energy density, 388 
static models, 186, 407 
structural origins, 442 

V404 Cyg, 279 
vacuum 

energy density of, 187, 386, 389, 393 
energy-momentum tensor, 187-8 
models, 389 
true, 438 

variational derivative, 530 
variational principles 

electromagnetism from, 536-9 
and general relativity, 524-49 


variations, calculus of, 87-8 
vector calculus, on manifolds, 53-91 
vector fields, 92-3, 191, 527-8 
components, 73 

contravariant components, 56, 57 
covariant components, 57 
curl, 71 
divergence, 70 
on manifold, 54-5 
parallel, 73 

vector operator, component form, 

70-1 

vectors 

angle between, 62 
as directional derivatives, 81-2 
as linear function, 92 
concept of, 53 
derivatives 

covariant, 68-70 
intrinsic, 71-3 
future-pointing, 116 
indices, 57, 59-60 
length, 62 
local, 54, 56 
null, 62, 115, 116, 475 
orthogonal, 62, 170 
parallel transport, 73-5 
past-pointing, 116 

properties, coordinate-independent, 61-2 

reciprocal systems, 57 

scalar product, 58 

spacelike, 115, 116, 152 

tangent, 55-6 

three-space, 127 

timelike, 115, 116, 152, 153 

zero-length, 61-2 

see also basis vectors; four-vectors; tangent 
vectors; three-vectors; unit vectors; 
wavevectors 

velocity addition in special relativity, 18-19 
Venus, photon path deflection, 237-9 
Venus-Earth time-delay measurements, 239 
very long baseline interferometry (VLBI), 235 
maser studies, 281 
Very Small Array (VSA), 460 
Viking lander, 239 
Virgo, 355 
volume 

finite or infinite, of non-Euclidean 
three-dimensional spaces, 42 
in Friedmann-Robertson-Walker geometry, 
374-6 

manifolds, 38-42 
proper, 10 

volume-redshift relation, 413-14 
wave equations, 474, 475, 498 
wavevectors 
comoving, 446 
see also four-wavevectors 
weak-field metric, 467-70 
Weber, Joseph (1919-2000), 518 
Weyl’s postulate, 356-7 
white dwarf, 155, 288 

electron degeneracy pressure in, 259 
in binary system, 277, 278 
white hole 
definition, 258 
existence of, 258-9, 270 
singularities, 258, 270 
see also black hole; wormhole 
Wilkinson Microwave Anisotropy Probe 
(WMAP), 460 


worldlines, 9, 12, 123, 125-6 
dust, 190 

fixed emitter and receiver, 204 
in arbitrary coordinates, 144 
in curved spacetime, 150, 152 
of centre of mass, 170 
of fundamental observers (galaxies), 356a 
see also particle worldlines; photon 
worldlines 
wormhole, 270 

and Einstein-Rosen bridge, 271-4 
dynamic structure, 272 

X-ray telescope, 260 
X-rays, emitters of, 240 

zero-rank tensor, 93 



